PREFACE TO THE TENTH EDITION.


That  a  new  edition  of  An  Introduction  to  the  Theory  of  Statistics 
should  again  be  called  for  within  three  years  of  the  issue  of 
the  last  is  satisfactory  evidence  that  the  work  continues  to  hold 
its  own  against  its  numerous  younger  competitors.  At  the  time 
of  going  to  press  negotiations  are  in  progress  for  the  issue  of  a 
Spanish  edition. 

The  attention  of  the  student  is  directed  to  the  Supplements 
at  the  end  of  the  book,  in  which,  to  save  expense  in  revision,  all 
new matter has been incorporated. In particular, Supplement
II.  gives  the  direct  proof  of  the  formulae  for  regressions,  which, 
for  the  student  with  some  knowledge  of  differential  calculus, 
will  be  preferable  to  the  indirect  deduction  of  Chap.  IX.  Supple- 
ments III.  and  IV.  deal  with  important  subjects  not  covered  in 
the  body  of  the  work.  The  additional  references  on  pp.  390 
et  seq.  have  been  revised  to  date  for  the  present  edition,  but 
readers  must  bear  in  mind  that  this  revision  is  necessarily 
closed  some  months  before  the  book  finally  goes  to  press. 
With the growth of the 'literature' bibliography becomes
more and more difficult and laborious: the time almost seems
to  have  come  for  the  publication  of  some  periodical  index 
giving  brief  abstracts  of  papers  and  short  notices  of  books. 

All  new  matter  in  the  present  edition  has  been  duly  incor- 
porated in  the  index,  which  has  been  revised  extensively. 

The  present  edition  marks  an  epoch  both  for  the  author  and 
the  book.  At  the  end  of  the  last  academic  year  I  resigned  the 
teaching  post  which  I  had  held  since  1912,  feeling  that  the  work 
now  calls  for  a  younger  man  and  a  better  qualified  mathe- 
matician.   As  for  the  book,  it  has  now  come  of  age,  the  first 
edition having been published in 1911. Over 10,000 copies in
all  have  been  sold,  the  success  of  the  work  having  far  exceeded 
expectations.  One  element  at  least  in  that  success,  in  my 
belief, is the fact that the book was definitely founded on experience,
personal experience in statistical work and personal
experience in teaching; and the same may be said of later
additions  and  revisions.  I  make  this  statement  partly  in 
self-defence.  Correspondents  have  frequently  requested  me 
to  include  chapters  on  subjects,  e.g.  technical  applications  of 
statistical  method,  of  which  I  know  nothing  and  have  no 
personal  experience  :  my  reply  has  always  been  a  refusal. 

But  the  success  of  a  book  is  certainly  due  to  its  publishers  as 
well  as  its  author,  and  I  am  glad  to  give  them  my  acknow- 
ledgments :  it  is  very  pleasant  to  look  back  on  our  friendly 
relations  over  so  many  years.  To  my  readers  also  let  me  take 
this  opportunity  of  sending  my  greetings  and  thanks  for  their 
kindly  judgment. 

G.  U.  Y. 

May  1932. 


PREFACE  TO  THE  FIRST  EDITION. 


The following chapters are based on the courses of instruction
given  during  my  tenure  of  the  Newmarch  Lectureship  in  Statistics 
at  University  College,  London,  in  the  sessions  1902-1909.  The 
variety  of  illustrations  and  examples  has,  however,  been  increased 
to  render  the  book  more  suitable  for  the  use  of  biologists  and 
others  besides  those  interested  in  economic  and  vital  statistics, 
and  some  of  the  more  difficult  parts  of  the  subject  have  been 
treated  in  greater  detail  than  was  possible  in  a  sessional  course 
of  some  thirty  lectures.  For  the  rest,  the  chapters  follow  closely 
the  arrangement  of  the  course,  the  three  parts  into  which  the 
volume  is  divided  corresponding  approximately  to  the  work  of 
the  three  terms.  To  enable  the  student  to  proceed  further  with 
the  subject,  fairly  detailed  lists  of  references  to  the  original 
memoirs  have  been  given  at  the  end  of  each  chapter :  exercises 
have  also  been  added  for  the  benefit,  more  especially,  of  the 
student  who  is  working  without  the  assistance  of  a  teacher. 

The  volume  represents  an  attempt  to  work  out  a  systematic 
introductory  course  on  statistical  methods — the  methods  available 
for  discussing,  as  distinct  from  collecting,  statistical  data — suited 
to  those  who  possess  only  a  limited  knowledge  of  mathematics  : 
an  acquaintance  with  algebra  up  to  the  binomial  theorem, 
together  with  such  elements  of  co-ordinate  geometry  as  are  now 
generally  included  therewith,  is  all  that  is  assumed.  I  hope  that 
it  may  prove  of  some  service  to  the  students  of  the  diverse 
sciences  in  which  statistical  methods  are  now  employed. 

My  most  grateful  thanks  are  due  to  Mr  R.  H.  Hooker  not  only 
for reading the greater part of the manuscript, and the proofs,
and  for  making  many  criticisms  and  suggestions  which  have 
been  of  the  greatest  service,  but  also  for  much  friendly  help  and 
encouragement  without  which  the  preparation  of  the  volume, 
often  delayed  and  interrupted  by  the  pressure  of  other  work, 
might  never  have  been  completed  :  my  debt  to  Mr  Hooker  is 
indeed  greater  than  can  well  be  expressed  in  a  formal  preface. 
My  thanks  are  also  due  to  Mr  H.  D.  Vigor  for  some  assistance 
in  checking  the  arithmetic,  and  my  acknowledgments  to  Professor 
Edgeworth  for  the  example  used  in  §  5  of  Chap.  XVII.  to  illustrate 
the  influence  of  the  form  of  the  frequency  distribution  on  the 
probable  error  of  the  median. 

I  can  hardly  hope  that  all  errors  in  the  text  or  in  the  mass 
of  arithmetic  involved  in  examples  and  exercises  have  been 
eliminated,  and  will  feel  indebted  to  any  reader  who  directs 
my  attention  to  any  such  mistakes,  or  to  any  omissions,  am- 
biguities, or  obscurities. 

G.  U.  Y. 

December  1910. 
THEORY OF STATISTICS.


INTRODUCTION. 

1-3. The introduction of the terms "statistics," "statistical," into the English
language — 4-6.  The  change  in  meaning  of  these  terms  during  the 
nineteenth  century — 7-9.  The  present  use  of  the  terms — 10.  Defini- 
tions of  "  statistics,"  "  statistical  methods,"  "  theory  of  statistics,"  in 
accordance  with  present  usage. 

1.  The  words  "statist,"  "statistics,"  "statistical,"  appear  to  be 
all derived, more or less indirectly, from the Latin status, in the
sense  that  it  acquired  in  mediaeval  Latin  of  a  political  state. 

2.  The  first  term  is,  however,  of  much  earlier  date  than  the  two 
others.  The  word  "statist"  is  found,  for  instance,  in  Hamlet 
(1602),^1 Cymbeline (1610 or 1611),^2 and in Paradise Regained
(1671).^3 The earliest occurrence of the word "statistics" yet
noted is in The Elements of Universal Erudition, by Baron J. F.
von Bielfeld, translated by W. Hooper, M.D. (3 vols., London, 1770).
One of its chapters is entitled Statistics, and contains a definition
of the subject as "The science that teaches us what is the political
arrangement of all the modern states of the known world."
"Statistics" occurs again with a rather wider definition in the
preface to A Political Survey of the Present State of Europe, by
E. A. W. Zimmermann,^5 issued in 1787. "It is about forty
years ago," says Zimmermann, "that that branch of political
knowledge, which has for its object the actual and relative
power of the several modern states, the power arising from their
natural advantages, the industry and civilisation of their inhabitants,
and the wisdom of their governments, has been formed, chiefly
by German writers, into a separate science. ... By the more convenient
form it has now received .... this science, distinguished
by the new-coined name of statistics, is become a favourite study
in Germany" (p. ii); and the adjective is also given (p. v), "To
the  several  articles  contained  in  this  work,  some  respectable 

1 Act v., sc. 2. 2 Act ii., sc. 4. 3

4 I cite from Dr W. F. Willcox, Quarterly Publications of the American
Statistical Association, vol. xiv., 1914, p. 287.

5 Zimmermann's work appears to have been written in English, though he
was a German, Professor of Natural Philosophy at Brunswick.
Statistical writers have added a view of the principal epochas of the
history  of  each  country." 

3.  Within  the  next  few  years  the  words  were  adopted  by  several 
writers,  notably  by  Sir  John  Sinclair,  the  editor  and  organiser  of  the 
first Statistical Account of Scotland,^1 to whom, indeed, their introduction
has been frequently ascribed. In the circular letter to the
Clergy of the Church of Scotland issued in May 1790,^2 he states
that in Germany "'Statistical Inquiries,' as they are called, have
been carried to a very great extent," and adds an explanatory
footnote to the phrase "Statistical Inquiries": "or inquiries
respecting the population, the political circumstances, the productions
of a country, and other matters of state." In the
"History of the Origin and Progress"^3 of the work, he tells us,
"Many people were at first surprised at my using the new words,
Statistics and Statistical, as it was supposed that some term in our
own language might have expressed the same meaning. But in
the course of a very extensive tour, through the northern parts of
Europe, which I happened to take in 1786, I found that in
Germany they were engaged in a species of political enquiry,
to which they had given the name of Statistics;^4 .... as I
thought that a new word might attract more public attention,
I resolved on adopting it, and I hope that it is now completely
naturalised and incorporated with our language." This hope
was  certainly  justified,  but  the  meaning  of  the  word  underwent 
rapid  development  during  the  half  century  or  so  following  its 
introduction. 

4.  "Statistics"  (statistik),  as  the  term  is  used  by  German 
writers  of  the  eighteenth  century,  by  Zimmermann  and  by  Sir 
John  Sinclair,  meant  simply  the  exposition  of  the  noteworthy 
characteristics  of  a  state,  the  mode  of  exposition  being — almost 
inevitably  at  that  time — preponderantly  verbal.  The  conciseness 
and  definite  character  of  numerical  data  were  recognised  at  a 
comparatively  early  period — more  particularly  by  English  writers 
—  but  trustworthy  figures  were  scarce.  After  the  commencement 
of  the  nineteenth  century,  however,  the  growth  of  official  data 
was  continuous,  and  numerical  statements,  accordingly,  began 
more  and  more  to  displace  the  verbal  descriptions  of  earlier  days. 
"  Statistics  "  thus  insensibly  acquired  a  narrower  signification,  viz., 

1 Twenty-one vols., 1791-99.

2 Statistical Account, vol. xx., Appendix to "The History of the Origin and
Progress" given at the end of the volume.

3 Loc. cit., p. xiii.

4 The Abriss der Staatswissenschaft der Europäischen Reiche (1749) of Gottfried
Achenwall, Professor of Politics at Göttingen, is the volume in which the word
"statistik" appears to be first employed, but the adjective "statisticus"
occurs at a somewhat earlier date in works written in Latin.
the exposition of the characteristics of a State by numerical
methods. It is difficult to say at what epoch the word came
definitely to bear this quantitative meaning, but the transition
appears to have been only half accomplished even after the foundation
of the Royal Statistical Society in 1834. The articles in the
first volume of the Journal, issued in 1838-9, are for the most
part of a numerical character, but the official definition has no
reference to method. "Statistics," we read, "may be said, in the
words of the prospectus of this Society, to be the ascertaining
and bringing together of those facts which are calculated to
illustrate the condition and prospects of society."^1 It is, however,
admitted that "the statist commonly prefers to employ figures
and tabular exhibitions."

5.  Once,  however,  the  first  change  of  meaning  was  accomplished, 
further  changes  followed.  From  the  name  of  a  science  or  art  of 
state-description  by  numerical  methods,  the  word  was  transferred  to 
those  series  of  figures  with  which  it  operated,  as  we  speak  of  vital 
statistics,  poor-law  statistics,  and  so  forth.  But  similar  data 
occur  in  many  connections  ;  in  meteorology,  for  instance,  in  anthro- 
pology, etc.  Such  collections  of  numerical  data  were  also  termed 
"statistics,"  and  consequently,  at  the  present  day,  the  word  is 
held  to  cover  a  collection  of  numerical  data,  analogous  to  those 
which  were  originally  formed  for  the  study  of  the  state,  on  almost 
any  subject  whatever.  We  not  only  read  of  rainfall  "statistics," 
but  of  "statistics"  showing  the  growth  of  an  organisation  for 
recording rainfall.^2 We find a chapter headed "Statistics" in a
book on psychology,^3 and the author writing of "statistics concerning
the mental characteristics of man," "statistics of children,
under the headings bright — average — dull."^4 We are informed
that, in a book on Latin verse, the characteristics of the Virgilian
hexameter "are examined carefully with statistics."^5

6.  The  development  in  meaning  of  the  adjective  "statistical" 
was  naturally  similar.  The  methods  applied  to  the  study  of 
numerical data concerning the state were still termed "statistical
methods," even when applied to data from other sources. Thus
we read of the inheritance of genius being treated "in a statistical
manner,"^6 and we have now "a journal for the statistical
study of biological problems."^7 Such phrases as "the statistical

1 Jour. Stat. Soc., vol. i. p. 1.

2 Symons' British Rainfall for 1899, p. 15.

3 E. W. Scripture, The New Psychology, 1897, chap. ii.

4 Op. cit. p. 18.

5 Athenæum, Oct. 3, 1903.

6 Francis Galton, Hereditary Genius (Macmillan, 1869), preface.

7 Biometrika, Cambridge Univ. Press, the first number issued in 1901.
investigation of the motion of molecules"^1 have become part of
the ordinary language of physicists. We find a work entitled
"the principles of statistical mechanics,"^2 and the Bakerian
lecture for 1909, by Sir J. Larmor, was on "the statistical and
thermodynamical relations of radiant energy."

7.  It  is  unnecessary  to  multiply  such  instances  to  show  that  the 
words  "statistics,"  "statistical,"  no  longer  bear  any  necessary 
reference  to  "  matters  of  state."  They  are  applied  indifferently  in 
physics,  biology,  anthropology,  and  meteorology,  as  well  as  in  the 
social  sciences.  Diverse  though  these  cases  are,  there  must  be 
some  community  of  character  between  them,  or  the  same  terms 
and  the  same  methods  would  not  be  applied.  What,  then,  is  this 
common character?

8.  Let  us  turn  to  social  science,  as  the  parent  of  the  methods 
termed  "  statistical,"  for  a  moment,  and  consider  its  characteristics 
as  compared,  say,  with  physics  or  chemistry.  One  characteristic 
stands  out  so  markedly  that  attention  has  been  repeatedly 
directed  to  it  by  "  statistical "  writers  as  the  source  of  the  peculiar 
difficulties  of  their  science  —  the  observer  of  social  facts  cannot  ex- 
periment, but  must  deal  with  circumstances  as  they  occur,  apart 
from  his  control.  Now  the  object  of  experiment  is  to  replace  the 
complex  systems  of  causation  usually  occurring  in  nature  by 
simple  systems  in  which  only  one  causal  circumstance  is  permitted 
to  vary  at  a  time.  This  simplification  being  impossible,  the 
observer  has,  in  general,  to  deal  with  highly  complicated  cases  of 
multiple  causation — cases  in  which  a  given  result  may  be  due  to 
any  one  of  a  number  of  alternative  causes  or  to  a  number  of 
different  causes  acting  conjointly. 

9.  A  little  consideration  will  show,  however,  that  this  is  also 
precisely the characteristic of the observations in other fields to
which  statistical  methods  are  applied.  The  meteorologist,  for 
example,  is  in  almost  precisely  the  same  position  as  the  student 
of  social  science.  He  can  experiment  on  minor  points,  but  the 
records  of  the  barometer,  thermometer,  and  rain  gauge  have  to  be 
treated  as  they  stand.  With  the  biologist,  matters  are  in  some- 
what better  case.  He  can  and  does  apply  experimental  methods 
to  a  very  large  extent,  but  frequently  cannot  approximate 
closely  to  the  experimental  ideal  ;  the  internal  circumstances  of 
animals  and  plants  too  easily  evade  complete  control.  Hence  a 
large  field  (notably  the  study  of  variation  and  heredity)  is  left, 
in  which  statistical  methods  have  either  to  aid  or  to  replace  the 
methods  of  experiment.     The  physicist  and  chemist,  finally, 

1 Clerk Maxwell, "Theory of Heat" (1871), and "On Boltzmann's
Theorem" (1878), Camb. Phil. Trans., vol. xii.

2 By J. Willard Gibbs (Macmillan, 1902).
stand at the other extremity of the scale. Theirs are the
sciences in which experiment has been brought to its greatest
perfection.  But  even  so,  statistical  methods  still  find  application. 
In  the  first  place,  the  methods  available  for  eliminating  the  effect 
of  disturbing  circumstances,  though  continually  improved,  are  not, 
and  cannot  be,  absolutely  perfect.  The  observer  himself,  as  well 
as  the  observing  instrument,  is  a  source  of  error ;  the  effects  of 
changes  of  temperature,  or  of  moisture,  of  pressure,  draughts,  vibra- 
tion, cannot  be  completely  eliminated.  Further,  in  the  problems 
of  molecular  physics,  referred  to  in  the  last  sentences  of  §  6, 
multiplicity  of  causes  is  of  the  essence  of  the  case.  The  motion 
of  an  atom  or  of  a  molecule  in  the  middle  of  a  swarm  is  dependent 
on  that  of  every  other  atom  or  molecule  in  the  swarm. 

10.  In  the  light  of  this  discussion,  we  may  accordingly  give  the 
following  definitions : — 

By  statistics  we  mean  quantitative  data  affected  to  a  marked 
extent  by  a  multiplicity  of  causes. 

By  statistical  methods  we  mean  methods  specially  adapted  to 
the  elucidation  of  quantitative  data  affected  by  a  multiplicity  of 
causes. 

By  theory  of  statistics  we  mean  the  exposition  of  statistical 
methods. 

The  insertion  in  the  first  definition  of  some  such  words  as  "to 
a  marked  extent  "  is  necessary,  since  the  term  "  statistics  "  is  not 
usually  applied  to  data,  like  those  of  the  physicist,  which  are 
affected  only  by  a  relatively  small  residuum  of  disturbing  causes. 
At  the  same  time,  "  statistical  methods  "  are  applicable  to  all  such 
cases,  whether  the  influence  of  many  causes  be  large  or  not. 
History of Official Statistics.

(8) Bertillon, J., Cours élémentaire de statistique; Société d'éditions
scientifiques, 1895. (Gives an exceedingly useful outline of the history
of official statistics in different countries.)


PART I.—THE THEORY OF ATTRIBUTES.


CHAPTER  I. 
NOTATION  AND  TERMINOLOGY. 

1-2.  Statistics  of  attributes  and  statistics  of  variables  :  fundamental  character 
of  the  former — 3-5.  Classification  by  dichotomy — 6-7.  Notation  for 
single attributes and for combinations — 8. The class-frequency — 9.
Positive and negative attributes, contraries — 10. The order of a class —
11. The aggregate — 12. The arrangement of classes by order and
aggregate — 13-14. Sufficiency of the tabulation of the ultimate
class-frequencies — 15-17. Or, better, of the positive class-frequencies — 18.
The class-frequencies chosen in the census for tabulation of statistics
of infirmities — 19. Inclusive and exclusive notations and terminologies.

1.  The  methods  of  statistics,  as  defined  in  the  Introduction, 
deal  with  quantitative  data  alone.  The  quantitative  character 
may,  however,  arise  in  two  different  ways. 

In  the  first  place,  the  observer  may  note  only  the  presence  or 
absence  of  some  attribute  in  a  series  of  objects  or  individuals,  and 
count  how  many  do  or  do  not  possess  it.  Thus,  in  a  given 
population,  we  may  count  the  number  of  the  blind  and  seeing, 
the  dumb  and  speaking,  or  the  insane  and  sane.  The  quantitative 
character,  in  such  cases,  arises  solely  in  the  counting. 

In  the  second  place,  the  observer  may  note  or  measure  the 
actual  magnitude  of  some  variable  character  for  each  of  the 
objects  or  individuals  observed.  He  may  record,  for  instance,  the 
ages  of  persons  at  death,  the  prices  of  different  samples  of  a 
commodity,  the  statures  of  men,  the  numbers  of  petals  in  flowers. 
The  observations  in  these  cases  are  quantitative  ab  initio. 

2.  The  methods  applicable  to  the  former  kind  of  observations, 
which  may  be  termed  statistics  of  attributes,  are  also  applicable 
to  the  latter,  or  statistics  of  variables.  A  record  of  statures  of 
men,  for  example,  may  be  treated  by  simply  counting  all  measure- 
ments as  tall  that  exceed  a  certain  limit,  neglecting  the  magnitude 
of  excess  or  defect,  and  stating  the  numbers  of  tall  and  short  (or 
more strictly not-tall) on the basis of this classification. Similarly,
the  methods  that  are  specially  adapted  to  the  treatment  of 
statistics  of  variables,  making  use  of  each  value  recorded,  are 
available  to  a  greater  extent  than  might  at  first  sight  seem  possible 
for  dealing  with  statistics  of  attributes.  For  example,  we  may 
treat  the  presence  or  absence  of  the  attribute  as  corresponding  to 
the  changes  of  a  variable  which  can  only  possess  two  values,  say 
0  and  1.  Or,  we  may  assume  that  we  have  really  to  do  with  a 
variable  character  which  has  been  crudely  classified,  as  suggested 
above,  and  we  may  be  able,  by  auxiliary  hypotheses  as  to  the 
nature  of  this  variable,  to  draw  further  conclusions.  But  the 
methods and principles developed for the case in which the observer
only  notes  the  presence  or  absence  of  attributes  are  the  simplest 
and  most  fundamental,  and  are  best  considered  first.  This  and 
the next three chapters (Chapters I.-IV.) are accordingly devoted
to  the  Theory  of  Attributes. 
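
The two devices described in this section, counting a measured variable as an attribute and treating an attribute as a variable that takes only the values 0 and 1, may be sketched in a few lines of Python. This is an illustration only and not part of the original text: the statures and the limit of 68 inches are invented, and the names `statures`, `LIMIT` and `indicator` are assumptions made for the example.

```python
# Hypothetical statures in inches; the figures are invented for illustration.
statures = [64.5, 70.2, 67.0, 72.1, 66.3, 69.8, 68.0, 71.4]

# Arbitrary boundary: a stature counts as "tall" only if it exceeds this limit.
LIMIT = 68.0

# Statistics of variables treated as statistics of attributes:
# only the side of the limit is kept, the magnitude of excess or defect is neglected.
tall = sum(1 for s in statures if s > LIMIT)
not_tall = len(statures) - tall

# The converse device: the attribute regarded as a variable taking the values 0 and 1.
indicator = [1 if s > LIMIT else 0 for s in statures]
proportion_tall = sum(indicator) / len(indicator)

print(tall, not_tall, proportion_tall)
```

The mean of the 0 and 1 values is simply the proportion of individuals possessing the attribute, which is one reason the second device is more useful than might at first appear.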

3.  The  objects  or  individuals  that  possess  the  attribute,  and 
those  that  do  not  possess  it,  may  be  said  to  be  members  of  two 
distinct  classes,  the  observer  classifying  the  objects  or  individuals 
observed.  In  the  simplest  case,  where  attention  is  paid  to  one 
attribute  alone,  only  two  mutually  exclusive  classes  are  formed. 
If  several  attributes  are  noted,  the  process  of  classification  may, 
however,  be  continued  indefinitely.  Those  that  do  and  do  not 
possess  the  first  attribute  may  be  reclassified  according  as  they  do 
or  do  not  possess  the  second,  the  members  of  each  of  the  sub- 
classes so  formed  according  as  they  do  or  do  not  possess  the 
third,  and  so  on,  every  class  being  divided  into  two  at  each  step. 
Thus  the  members  of  the  population  of  any  district  may  be 
classified  into  males  and  females ;  the  members  of  each  sex  into 
sane  and  insane ;  the  insane  males,  sane  males,  insane  females, 
and  sane  females  into  blind  and  seeing.  If  we  were  dealing  with 
a number of peas (Pisum sativum) of different varieties, they
might  be  classified  as  tall  or  dwarf,  with  green  seeds  or  yellow 
seeds,  with  wrinkled  seeds  or  round  seeds,  so  that  we  would  have 
eight  classes — tall  with  round  green  seeds,  tall  with  round  yellow 
seeds,  tall  with  wrinkled  green  seeds,  tall  with  wrinkled  yellow 
seeds,  and  four  similar  classes  of  dwarf  plants. 
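
The repeated division described above doubles the number of classes at each step, so that three attributes yield eight classes. A minimal sketch in Python, using the pea illustration of the text, is given here as a modern aid and not as part of the original; the variable names are assumptions made for the example.

```python
from itertools import product

# Each attribute is paired with its alternative, as in the pea illustration.
dichotomies = [
    ("tall", "dwarf"),
    ("round seeds", "wrinkled seeds"),
    ("green seeds", "yellow seeds"),
]

# Taking one alternative from every pair specifies one of the final classes.
classes = list(product(*dichotomies))

for c in classes:
    print(", ".join(c))

# Every class is divided into two at each step: 2 * 2 * 2 classes in all.
print(len(classes))
```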

4. It may be noticed that the fact of classification does not
necessarily  imply  the  existence  of  either  a  natural  or  a  clearly 
defined  boundary  between  the  two  classes.  The  boundary  may 
be  wholly  arbitrary,  e.g.  where  prices  are  classified  as  above  or 
below  some  special  value,  barometer  readings  as  above  or  below 
some  particular  height.  The  division  may  also  be  vague  and 
uncertain :  sanity  and  insanity,  sight  and  blindness,  pass 
into  each  other  by  such  fine  gradations  that  judgments  may 
differ as to the class in which a given individual should be
entered.  The  possibility  of  uncertainties  of  this  kind  should 
always  be  borne  in  mind  in  considering  statistics  of  attributes : 
whatever  the  nature  of  the  classification,  however,  natural  or 
artificial,  definite  or  uncertain,  the  final  judgment  must  be  de- 
cisive ;  any  one  object  or  individual  must  be  held  either  to  possess 
the  given  attribute  or  not. 

5.  A  classification  of  the  simple  kind  considered,  in  which  each 
class  is  divided  into  two  sub-classes  and  no  more,  has  been  termed 
by logicians classification, or, to use the more strictly applicable
term, division, by dichotomy (cutting in two). The classifications
of most statistics are not dichotomous, for most usually a
class  is  divided  into  more  than  two  sub-classes,  but  dichotomy  is 
the  fundamental  case.  In  Chapter  V.  the  relation  of  dichotomy 
to  more  elaborate  (manifold,  instead  of  twofold  or  dichotomous) 
processes  of  classification,  and  the  methods  applicable  to  some 
such  cases,  are  dealt  with  briefly. 

6.  For  theoretical  purposes  it  is  necessary  to  have  some  simple 
notation  for  the  classes  formed,  and  for  the  numbers  of  observa- 
tions assigned  to  each. 

The capitals A, B, C, . . . will be used to denote the several
attributes. An object or individual possessing the attribute A
will be termed simply A. The class, all the members of which
possess the attribute A, will be termed the class A. It is convenient
to use single symbols also to denote the absence of the
attributes A, B, . . . We shall employ the Greek letters, α,
β, γ, . . . Thus if A represents the attribute blindness, α
represents sight, i.e. non-blindness; if B stands for deafness, β
stands for hearing. Generally "α" is equivalent to "non-A," or
an object or individual not possessing the attribute A; the class α
is equivalent to the class none of the members of which possess the
attribute A.

7. Combinations of attributes will be represented by juxtapositions
of letters. Thus if, as above, A represents blindness, B
deafness, AB represents the combination blindness and deafness.
If the presence and absence of these attributes be noted, the four
classes so formed, viz. AB, Aβ, αB, αβ, include respectively the
blind and deaf, the blind but not-deaf, the deaf but not-blind, and
the neither blind nor deaf. If a third attribute be noted, e.g. insanity,
denoted say by C, the class ABC includes those who are
at once deaf, blind, and insane, ABγ those who are deaf and blind
but not insane, and so on.

Any letter or combination of letters like A, AB, αB, ABγ, by
means of which we specify the characters of the members of a class,
may be termed a class symbol.
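
The rule for forming a class symbol, a capital where the attribute is present and the corresponding Greek letter where it is absent, can be written out as a short Python function. This is offered only as an illustrative sketch: the function name `class_symbol` and the dictionary `GREEK` are assumptions of the example, not notation from the text, and only the three attributes A, B, C are provided for.

```python
# Greek letter denoting the absence of each attribute (α = non-A, and so on).
GREEK = {"A": "α", "B": "β", "C": "γ"}

def class_symbol(present):
    """Build a class symbol from a mapping of attribute letter to
    True (attribute present) or False (attribute absent)."""
    return "".join(
        letter if has_it else GREEK[letter]
        for letter, has_it in sorted(present.items())
    )

# A = blindness, B = deafness, C = insanity, as in the illustration of the text.
print(class_symbol({"A": True, "B": True}))              # blind and deaf
print(class_symbol({"A": False, "B": True}))             # deaf but not blind
print(class_symbol({"A": True, "B": True, "C": False}))  # blind and deaf, not insane
```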
8. The number of observations assigned to any class is termed,
for  brevity,  the  frequency  of  the  class,  or  the  class-frequency. 
Class-frequencies  will  be  denoted  by  enclosing  the  corresponding 
class-symbols  in  brackets.    Thus — 

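(A)    denotes number of A's,    i.e. objects possessing attribute A
(α)    denotes number of α's,    i.e. objects not possessing attribute A
(AB)   denotes number of AB's,   i.e. objects possessing attributes A and B
(αB)   denotes number of αB's,   i.e. objects possessing attribute B but not A
(ABC)  denotes number of ABC's,  i.e. objects possessing attributes A, B, and C
(αBC)  denotes number of αBC's,  i.e. objects possessing attributes B and C but not A
(αβC)  denotes number of αβC's,  i.e. objects possessing attribute C but neither A nor B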

and  so  on  for  any  number  of  attributes.  If  A  represent,  as  in 
the illustration above, blindness, B deafness, C insanity, the
symbols  given  stand  for  the  numbers  of  the  blind,  the  not-blind, 
the  blind  and  deaf,  the  deaf  but  not  blind,  the  blind,  deaf,  and  in- 
sane, the  deaf  and  insane  but  not  blind,  and  the  insane  but  neither 
blind  nor  deaf,  respectively. 
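
To make the notation concrete, the class-frequencies listed above may be counted from a small set of records. The following Python sketch is an illustration under stated assumptions: the ten records are invented, each person is represented by the set of attributes possessed, and the helper `frequency` is a name introduced here and not taken from the text.

```python
# Hypothetical records: each person is the set of attributes possessed
# (A = blindness, B = deafness, C = insanity). The data are invented.
records = [
    {"A"}, {"B"}, {"A", "B"}, set(), {"B", "C"},
    {"A", "B", "C"}, {"C"}, set(), {"A"}, {"B", "C"},
]

# Greek letter -> the attribute whose absence it denotes.
ABSENCE = {"α": "A", "β": "B", "γ": "C"}

def frequency(symbol, records):
    """Count the records falling in the class written as, e.g., 'αBC'."""
    def belongs(person):
        for ch in symbol:
            if ch in ABSENCE:
                # A Greek letter requires the attribute to be absent.
                if ABSENCE[ch] in person:
                    return False
            elif ch not in person:
                # A capital requires the attribute to be present.
                return False
        return True
    return sum(1 for person in records if belongs(person))

for symbol in ["A", "α", "AB", "αB", "ABC", "αBC", "αβC"]:
    print(f"({symbol}) = {frequency(symbol, records)}")
```

Attributes not named in a symbol are left unrestricted, so that (A) counts every blind person whether deaf or not.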

9. The attributes denoted by capitals A, B, C, . . . may be
termed  positive  attributes,  and  their  contraries  denoted  by  Greek 
letters  negative  attributes.  If  a  class-symbol  include  only 
capital  letters,  the  class  may  be  termed  a  positive  class ;  if  only 
Greek  letters,  a  negative  class.  Thus  the  classes  A,  AB,  ABC 
are positive classes; the classes α, αβ, αβγ, negative classes.

If two classes are such that every attribute in the symbol for
the one is the negative or contrary of the corresponding attribute
in the symbol for the other, they may be termed contrary classes
and their frequencies contrary frequencies; e.g. AB and αβ, Aβ
and αB, AβC and αBγ, are pairs of contraries.

10. The classes obtained by noting say n attributes fall into
natural groups according to the numbers of attributes used to
specify the respective classes, and these natural groups should be
borne in mind in tabulating the class-frequencies. A class
specified by r attributes may be spoken of as a class of the rth
order and its frequency as a frequency of the rth order. Thus AB,
AC, BC are classes of the second order; (A), (Aβ), (αBC),
(ABγD), class-frequencies of the first, second, third, and fourth
orders respectively.

11. The classes of one and the same order fall into further
groups according to the actual attributes specified. Thus if three
attributes A, B, C have been noted, the classes of the second order
may be specified by any one of the pairs of attributes AB, AC, or
BC (and their contraries). The series of classes or class-frequencies
given by any one positive class and the classes whose symbols
are derived therefrom by substituting Greek letters for one or
more of the italic capital letters in every possible way will be
termed an aggregate. Thus (AB) (Aβ) (αB) (αβ) form an aggregate
of frequencies of the second order, and the twelve classes of
the second order which can be formed where three attributes
have been noted may be grouped into three such aggregates.

12.  Class-frequencies  should,  in  tabulating,  be  arranged  so  that 
frequencies  of  the  same  order  and  frequencies  belonging  to  the 
same  aggregate  are  kept  together.  Thus  the  frequencies  for  the 
case  of  three  attributes  should  be  grouped  as  given  below  ;  the 
whole  number  of  observations  denoted  by  the  letter  N  being 
reckoned  as  a  frequency  of  order  zero,  since  no  attributes  are 
specified : — 

Order 0.   N

Order 1.   (A)     (B)     (C)
           (α)     (β)     (γ)

Order 2.   (AB)    (AC)    (BC)
           (Aβ)    (Aγ)    (Bγ)
           (αB)    (αC)    (βC)
           (αβ)    (αγ)    (βγ)

Order 3.   (ABC)   (αBC)
           (ABγ)   (αBγ)
           (AβC)   (αβC)
           (Aβγ)   (αβγ)                                   (1)

13. In such a complete table for the case of three attributes,
twenty-seven distinct frequencies are given: 1 of order zero, 6
of the first order, 12 of the second, and 8 of the third. It
is, however, in no case necessary to give such a complete
statement.

The whole number of observations must clearly be equal to the
number of A's together with the number of α's, the number of
A's to the number of A's that are B together with the number of
A's that are not B; and so on, i.e. any class-frequency can always
be expressed in terms of class-frequencies of higher order. Thus:

N    = (A) + (α) = (B) + (β) = etc.
     = (AB) + (Aβ) + (αB) + (αβ) = etc.
(A)  = (AB) + (Aβ) = (AC) + (Aγ) = etc.
(AB) = (ABC) + (ABγ) = etc.                                (2)

Hence, instead of enumerating all the frequencies as under (1),
no more need be given, for the case of three attributes, than
the eight frequencies of the third order. If four attributes had
been noted it would be sufficient to give the sixteen frequencies of
the fourth order.

The classes specified by all the attributes noted in any case,
i.e. classes of the nth order in the case of n attributes, may be
termed the ultimate classes and their frequencies the ultimate
frequencies. Hence we may say that it is never necessary to
enumerate more than the ultimate frequencies. All the others can
be obtained from these by simple addition.

Example  i. — (See  reference  5  at  the  end  of  the  chapter.) 
A  number  of  school  children  were  examined  for  the  presence 
or  absence  of  certain  defects  of  which  three  chief  descriptions 
were  noted,  A  development  defects,  B  nerve  signs,  C  low 
nutrition. 

Given the following ultimate frequencies, find the frequencies
of the positive classes, including the whole number of observations N.

(ABC)      57      (αBC)      78
(ABγ)     281      (αBγ)     670
(AβC)      86      (αβC)      65
(Aβγ)     453      (αβγ)   8,310

The  whole  number  of  observations  N  is  equal  to  the  grand 
total:  10,000. 

The  frequency  of  any  first-order  class,  e.g.  (A)  is  given  by  the 
total  of  the  four  third-order  frequencies,  the  class-symbols  for 
which  contain  the  same  letter — 

(ABC) + (ABγ) + (AβC) + (Aβγ) = (A) = 877.

Similarly, the frequency of any second-order class, e.g. (AB), is
given by the total of the two third-order frequencies, the class-symbols
for which both contain the same pair of letters:

(ABC) + (ABγ) = (AB) = 338.

The complete results are:

N      10,000      (AB)     338
(A)       877      (AC)     143
(B)     1,086      (BC)     135
(C)       286      (ABC)     57
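
The additions of Example i. can be sketched in a few lines of Python. This is only an illustration, not part of the original method: the dictionary layout and the function names are assumptions, and a lower-case letter is used here to stand for the Greek (negative) letter, so that "aBC" means (αBC).

```python
from itertools import combinations

# Ultimate frequencies of Example i. Lower-case letters stand for the
# Greek (negative) letters; this encoding is an assumption of the sketch.
ultimate = {
    "ABC": 57, "aBC": 78, "ABc": 281, "aBc": 670,
    "AbC": 86, "abC": 65, "Abc": 453, "abc": 8310,
}

def positive_frequency(ultimate, attrs):
    """Add every ultimate frequency whose symbol contains all capitals in attrs."""
    return sum(f for symbol, f in ultimate.items()
               if all(letter in symbol for letter in attrs))

def positive_classes(ultimate, letters="ABC"):
    """Return N and all positive class-frequencies, keyed by class symbol."""
    result = {}
    for order in range(len(letters) + 1):
        for attrs in combinations(letters, order):
            # The empty selection specifies no attribute, i.e. it gives N.
            result["".join(attrs) or "N"] = positive_frequency(ultimate, attrs)
    return result

print(positive_classes(ultimate))
```

The printed values can be compared with the table of complete results above.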

14. The number of ultimate frequencies in the general case of
n attributes, or the number of classes in an aggregate of the nth
order, is given by considering that each letter of the class-symbol
may be written in two ways (A or α, B or β, C or γ), and that
either way of writing one letter may be combined with either
way of writing another. Hence the whole number of ways in
which the class-symbol may be written, i.e. the number of
classes, is:

2 × 2 × 2 × 2 . . . = 2^n.

The ultimate frequencies form one natural set in terms of which
the data are completely given, but any other set containing the
same number of algebraically independent frequencies, viz. 2^n,
may be chosen instead.

15. The positive class-frequencies, including under this head the
total number of observations N, form one such set. They are
algebraically independent; no one positive class-frequency can be
expressed wholly in terms of the others. Their number is, moreover,
2^n, as may be readily seen from the fact that if the Greek letters
are struck out of the symbols for the ultimate classes, they become
the symbols for the positive classes, with the exception of αβγ
. . . . for which N must be substituted. Otherwise the number
is made up as follows:

Order 0. (The whole number of observations)                      1
Order 1. (The number of attributes noted)                        n
Order 2. (The number of combinations of n things 2 together)     n(n − 1)/(1·2)
Order 3. (The number of combinations of n things 3 together)     n(n − 1)(n − 2)/(1·2·3)

and so on. But the series

1 + n + n(n − 1)/(1·2) + n(n − 1)(n − 2)/(1·2·3) + . . .

is the binomial expansion of (1 + 1)^n or 2^n, therefore the total
number of positive classes is 2^n.
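
The counts in §§ 13 to 15 can be checked mechanically. The sketch below is an illustration only (the function names are invented for it). It assumes that a class of order r is formed by choosing r of the n attributes and writing each chosen letter in one of two ways, which gives the 1, 6, 12, 8 of § 13 when n = 3.

```python
from math import comb

def classes_per_order(n):
    """Number of distinct classes of each order r = 0..n for n attributes:
    choose r attributes, then write each letter as a capital or a Greek letter."""
    return [comb(n, r) * 2**r for r in range(n + 1)]

def positive_class_count(n):
    """Number of positive classes, N included: 1 + n + n(n-1)/2 + ..."""
    return sum(comb(n, r) for r in range(n + 1))

n = 3
print(classes_per_order(n))        # counts per order for three attributes
print(sum(classes_per_order(n)))   # total number of distinct frequencies
print(positive_class_count(n))     # equals 2**n
```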

16.  The  set  of  positive  class-frequencies  is  a  most  convenient 
one  for  both  theoretical  and  practical  purposes. 

Compare, for instance, the two forms of statement, in terms of
the ultimate and the positive classes respectively, as given in
Example i., § 13. The latter gives directly the whole number of
observations and the totals of A's, B's, and C's. The former gives
none of these fundamentally important figures without the performance
of more or less lengthy additions. Further, the latter gives
the second-order frequencies (AB), (AC), and (BC), which are necessary
for discussing the relations subsisting between A, B, and C, but
are only indirectly given by the frequencies of the ultimate classes.

17. The expression of any class-frequency in terms of the
positive frequencies is most easily obtained by a process of
step-by-step substitution; thus:

(αβ)  = (α) − (αB)
      = N − (A) − (B) + (AB)                                           (3)

(αβγ) = (αβ) − (αβC)
      = N − (A) − (B) + (AB) − (αC) + (αBC)
      = N − (A) − (B) − (C) + (AB) + (AC) + (BC) − (ABC)               (4)

Arithmetical work, however, should be executed from first
principles, and not by quoting formulae like the above.

Example  ii. — Check  the  work  of  Example  i.,  §  13,  by  finding  the 
frequencies  of  the  ultimate  classes  from  the  frequencies  of  the 
positive  classes. 

(ABγ) = (AB) − (ABC) = 338 − 57 = 281

(Aβγ) = (Aγ) − (ABγ) = (A) − (AC) − (ABγ)
      = 877 − 143 − 281 = 453

(αβγ) = (βγ) − (Aβγ) = N − (B) − (C) + (BC) − (Aβγ)
      = 10,000 − 1086 − 286 + 135 − 453
      = 10,135 − 1825 = 8310

and  so  on. 
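
As a mechanical cross-check on Example ii., the expansion of an ultimate frequency in terms of the positive frequencies can be written as an alternating sum over the attributes that are negated, in the manner of formulae (3) and (4). The sketch below is illustrative only; the function name and the dictionary keys are assumptions, and it does not replace the first-principles working recommended above.

```python
from itertools import combinations

# Positive class-frequencies from Example i., § 13.
positive = {"N": 10000, "A": 877, "B": 1086, "C": 286,
            "AB": 338, "AC": 143, "BC": 135, "ABC": 57}

def ultimate_frequency(positive, present, letters="ABC"):
    """Frequency of the ultimate class in which the attributes in `present`
    occur and every other noted attribute is negated."""
    absent = [x for x in letters if x not in present]
    total = 0
    for r in range(len(absent) + 1):
        for extra in combinations(absent, r):
            key = "".join(sorted(set(present) | set(extra))) or "N"
            total += (-1) ** r * positive[key]   # signs alternate with r
    return total

print(ultimate_frequency(positive, "AB"))  # (ABγ)
print(ultimate_frequency(positive, "A"))   # (Aβγ)
print(ultimate_frequency(positive, ""))    # (αβγ)
```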

18. Examples of statistics of precisely the kind now under
consideration are afforded by the census returns, e.g. of 1891 or
1901, for England and Wales, of persons suffering from different
"infirmities," any individual who is deaf and dumb, blind or
mentally deranged (lunatic, imbecile, or idiot) being required to
be returned as such on the schedule. The classes chosen for
tabulation are, however, neither the positive nor the ultimate
classes, but the following (neglecting minor distinctions amongst
the mentally deranged and the returns of persons who are deaf
but not dumb): Dumb, blind, mentally deranged; dumb and
blind but not deranged; dumb and deranged but not blind;
blind and deranged but not dumb; blind, dumb, and deranged.
If, in the symbolic notation, deaf-mutism be denoted by A, blindness
by B, and mental derangement by C, the class-frequencies
thus given are (A), (B), (C), (ABγ), (AβC), (αBC), (ABC) (cf.
Census of England and Wales, 1891, vol. iii., tables 15 and 16,
p. lvii. Census of 1901, Summary Tables, table xlix.). This set of
frequencies does not appear to possess any special advantages.

19.  The  symbols  of  our  notation  are,  it  should  be  remarked, 
used  in  an  inclusive  sense,  the  symbol  A,  for  example,  signifying 
an  object  or  individual  possessing  the  attribute  A  with  or  without 
others. This seems to be the only natural use of the symbol,
but at least one notation has been constructed on an exclusive
basis (cf. ref. 5), the symbol A denoting that the object or individual
possesses the attribute A, but not B or C or D, or whatever
other attributes have been noted. An exclusive notation is
apt to be relatively cumbrous and also ambiguous, for the reader
cannot know what attributes a given symbol excludes until he
has seen the whole list of attributes of which note has been
taken, and this list he must bear in mind. The statement that
the symbol A is used exclusively cannot mean, obviously, that the
object referred to possesses only the attribute A and no others
whatever; it merely excludes the other attributes noted in the
particular  investigation.  Adjectives,  as  well  as  the  symbols  which 
may  represent  them,  are  naturally  used  in  an  inclusive  sense,  and 
care  should  therefore  be  taken,  when  classes  are  verbally  described, 
that  the  description  is  complete,  and  states  what,  if  anything,  is 
excluded  as  well  as  what  is  included,  in  the  same  way  as  our 
notation.  The  terminology  of  the  English  census  has  not,  in 
this  respect,  been  quite  clear.  The  "  Blind  "  includes  those  who 
are  "  Blind  and  Dumb,"  or  "  Blind,  Dumb,  and  Lunatic,"  and  so 
forth.  But  the  heading  "  Blind  and  Dumb,"  in  the  table  relating 
to  "  combined  infirmities,"  is  used  in  the  sense  "  Blind  and  Dumb, 
but  not  Lunatic  or  Imbecile,"  etc.,  and  so  on  for  the  others.  In 
the  first  table  the  headings  are  inclusive,  in  the  second  exclusive. 


(1) Jevons, W. Stanley, "On a General System of Numerically Definite
Reasoning," Memoirs of the Manchester Lit. and Phil. Soc., 1870.
Reprinted in Pure Logic and other Minor Works; Macmillan, 1890.
(The method used in these chapters is that of Jevons, with the notation
slightly modified to that employed in the next three memoirs cited.)

(2) Yule, G. U., "On the Association of Attributes in Statistics, etc.," Phil.
Trans. Roy. Soc., Series A, vol. cxciv., 1900, p. 257.

(3) Yule, G. U., "On the Theory of Consistence of Logical Class-frequencies
and its Geometrical Representation," Phil. Trans. Roy. Soc., Series A,
vol. cxcvii., 1901, p. 91.

(4) Yule, G. U., "Notes on the Theory of Association of Attributes in
Statistics," Biometrika, vol. ii., 1903, p. 121. (The first three sections
of (4) are an abstract of (2) and (3). The remarks made as regards the
tabulation of class-frequencies at the end of (2) should be read in
connection with the remarks made at the beginning of (3) and in this
chapter: cf. footnote on p. 94 of (3).)

Material has been cited from, and reference made to the notation used in:

(5) Warner, F., and others, "Report on the Scientific Study of the Mental and
Physical Conditions of Childhood"; published by the Committee,
Parkes Museum, 1895.

(6) Warner, F., "Mental and Physical Conditions among Fifty Thousand
Children, etc.," Jour. Roy. Stat. Soc., vol. lix., 1896, p. 125.


EXERCISES.

1. (Figures from ref. (5).) The following are the numbers of boys observed
with certain classes of defects amongst a number of school-children. A
denotes development defects; B, nerve signs; C, low nutrition.

(ABC)      149      (αBC)       204
(ABγ)      738      (αBγ)     1,762
(AβC)      225      (αβC)       171
(Aβγ)    1,196      (αβγ)    21,842

Find the frequencies of the positive classes.

2. (Figures from ref. (5).) The following are the frequencies of the
positive classes for the girls in the same investigation:

N      23,713      (AB)     587
(A)     1,618      (AC)     428
(B)     2,015      (BC)     335
(C)       770      (ABC)    156

Find  the  frequencies  of  the  ultimate  classes. 

3. (Figures from Census, England and Wales, 1891, vol. iii.) Convert the
census statement as below into a statement in terms of (a) the positive, (b)
the ultimate class-frequencies. A = blindness, B = deaf-mutism, C = mental
derangement.

N      29,002,525      (ABγ)     82
(A)        23,467      (AβC)    380
(B)        14,192      (αBC)    500
(C)        97,383      (ABC)     25

4. (Cf. Mill's Logic, bk. iii., ch. xvii., and ref. (1).) Show that if A
occurs in a larger proportion of the cases where B is than where B is not,
then will B occur in a larger proportion of the cases where A is than where
A is not: i.e. given (AB)/(B) > (Aβ)/(β), show that (AB)/(A) > (αB)/(α).

5. (Cf. De Morgan, Formal Logic, p. 163, and ref. (1).) Most B's are A's,
most B's are C's: find the least number of A's that are C's, i.e. the lowest
possible value of (AC).

6. Given that

(A) = (α) = (B) = (β) = ½N,

show that

(AB) = (αβ), (Aβ) = (αB).

7. (Cf. ref. (2), § 9, "Case of equality of contraries.") Given that

(A) = (α) = (B) = (β) = (C) = (γ) = ½N,

and also that

(ABC) = (αβγ),

show that

2(ABC) = (AB) + (AC) + (BC) − ½N.

8.  Measurements  are  made  on  a  thousand  husbands  and  a  thousand  wives. 
If  the  measurements  of  the  husbands  exceed  the  measurements  of  the  wives  in 
800  cases  for  one  measurement,  in  700  cases  for  another,  and  in  660  cases  for 
both measurements, in how many cases will both measurements on the wife
exceed  the  measurements  on  the  husband  ? 


CHAPTER  II. 


CONSISTENCE. 

1-3.  The  field  of  observation  or  universe  and  its  specification  by  symbols — 
4. Derivation of complex from simple relations by specifying the
universe— 5-6.  Consistence — 7-10.  Conditions  of  consistence  for  one 
and  for  two  attributes— 11-14.  Conditions  of  consistence  for  three 
attributes. 

1.  Any  statistical  inquiry  is  necessarily  confined  to  a  certain 
time,  space,  or  material.  An  investigation  on  the  prevalence  of 
insanity,  for  instance,  may  be  limited  to  England,  to  England  in 
1901,  to  English  males  in  1901,  or  even  to  English  males  over  60 
years  of  age  in  1901,  and  so  on. 

For  actual  work  on  any  given  subject,  no  term  is  required  to 
denote  the  material  to  which  the  work  is  so  confined :  the  limits 
are  specified,  and  that  is  sufficient.  But  for  theoretical  purposes 
some term is almost essential to avoid circumlocution. The
expression the universe of discourse, or simply the universe, used
in  this  sense  by  writers  on  logic,  may  be  adopted  as  familiar  and 
convenient. 

2.  The  universe,  like  any  class,  may  be  considered  as  specified 
by  an  enumeration  of  the  attributes  common  to  all  its  members, 
e.g. to take the illustration of § 1, those implied by the predicates
English, male, over 60 years of age, living in 1901. It is not, in
general, necessary to introduce a special letter into the
class-symbols to denote the attributes common to all members of the
universe. We know that such attributes must exist, and the
common symbol can be understood.

In  strictness,  however,  the  symbol  ought  to  be  written  :  if,  say, 
U  denote  the  combination  of  attributes,  English — male — over  60 
— living  in  1901,  A  insanity,  B  blindness,  we  should  strictly  use 
the  symbols — 

(U)   = Number of English males over 60 living in 1901,
(UA)  = Number of insane English males over 60 living in 1901,
(UB)  = Number of blind English males over 60 living in 1901,
(UAB) = Number of blind and insane English males over 60 living in 1901,

instead of the simpler symbols N, (A), (B), (AB). Similarly, the
general relations (2), § 13, Chap. I., using U to denote the common
attributes of all the members of the universe and (U) consequently
the total number of observations, should in strictness be written
in the form:

(U)   = (UA) + (Uα) = (UB) + (Uβ) = etc.
      = (UAB) + (UAβ) + (UαB) + (Uαβ) = etc.
(UA)  = (UAB) + (UAβ) = (UAC) + (UAγ) = etc.
(UAB) = (UABC) + (UABγ) = etc.

3. Clearly, however, we might have used any other symbol
instead of U to denote the attributes common to all the members
of the universe, e.g. A or B or AB or ABC, writing in the latter
case:

(ABC) = (ABCD) + (ABCδ)

and so on. Hence any attribute or combination of attributes
common to all the class-symbols in an equation may be regarded as
specifying the universe within which the equation holds good.
Thus the equation just written may be read in words: "The
number of objects or individuals in the universe ABC is equal to
the number of D's together with the number of not-D's within
the same universe." The equation

(AC) = (ABC) + (AβC)

may be read: "The number of A's is equal to the number of A's
that are B together with the number of A's that are not-B
within the universe C."

4.  The  more  complex  may  be  derived  from  the  simpler  relations 
between  class-frequencies  very  readily  by  the  process  of  specifying 
the  universe.    Thus  starting  from  the  simple  equation 

(α) = N − (A),

we have, by specifying the universe as β,

(αβ) = (β) − (Aβ)
     = N − (A) − (B) + (AB).

Specifying the universe, again, as γ, we have

(αβγ) = (γ) − (Aγ) − (Bγ) + (ABγ)
      = N − (A) − (B) − (C) + (AB) + (AC) + (BC) − (ABC).

5. Any class-frequencies which have been or might have been
observed within one and the same universe may be said to be
consistent with one another. They conform with one another,
and do not in any way conflict.

The  conditions  of  consistence  are  some  of  them  simple,  but 
others  are  by  no  means  of  an  intuitive  character.  Suppose,  for 
instance,  the  data  are  given — 

N     1000      (AB)     42
(A)    525      (AC)    147
(B)    312      (BC)     86
(C)    470      (ABC)    25

There is nothing obviously wrong with the figures. Yet they
are certainly inconsistent. They might have been observed at
different times, in different places or on different material, but
they cannot have been observed in one and the same universe.
They imply, in fact, a negative value for (αβγ):

(αβγ) = 1000 − 525 − 312 − 470 + 42 + 147 + 86 − 25
      = 1000 − 1307 + 275 − 25
      = −57.

Clearly  no  class-frequency  can  be  negative.  If  the  figures, 
consequently,  are  alleged  to  be  the  result  of  an  actual  inquiry  in 
a  definite  universe,  there  must  have  been  some  miscount  or 
misprint. 

6. Generally, then, we may say that any given class-frequencies
are inconsistent if they imply negative values for any of the
unstated frequencies. Otherwise they are consistent. To test the
consistence of any set of 2^n algebraically independent frequencies,
for the case of n attributes, we should accordingly calculate
the values of all the unstated frequencies, and so verify the fact
that none of them is negative. This procedure may, however, be limited
by a simple consideration. If the ultimate class-frequencies are
not negative, no others can be so, being derived from the ultimate
frequencies by simple addition. Hence we need only calculate
the values of the ultimate class-frequencies in terms of those
given, and verify the fact that none of them is less than zero.
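
The test described in § 6 can be sketched as follows for data given as positive class-frequencies. This is an illustration under stated assumptions: the function names are invented for it, and lower-case letters stand for the Greek (negative) letters, so that "abc" means (αβγ). The data are those of § 5.

```python
from itertools import combinations

def ultimate_frequencies(positive, letters="ABC"):
    """Expand every ultimate frequency in terms of the positive frequencies."""
    result = {}
    for k in range(len(letters) + 1):
        for present in combinations(letters, k):
            absent = [x for x in letters if x not in present]
            total = 0
            for r in range(len(absent) + 1):
                for extra in combinations(absent, r):
                    key = "".join(sorted(present + extra)) or "N"
                    total += (-1) ** r * positive[key]
            symbol = "".join(x if x in present else x.lower() for x in letters)
            result[symbol] = total
    return result

def negative_classes(positive, letters="ABC"):
    """Ultimate classes whose implied frequency is negative (empty if consistent)."""
    return {s: f for s, f in ultimate_frequencies(positive, letters).items() if f < 0}

data = {"N": 1000, "A": 525, "B": 312, "C": 470,
        "AB": 42, "AC": 147, "BC": 86, "ABC": 25}
print(negative_classes(data))
```

An empty result would indicate consistent data. For the figures of § 5 the only class reported is the one the text identifies, (αβγ).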

7. As we saw in the last chapter, there are two sets of 2^n
algebraically independent frequencies of practical importance, viz.
(1) the ultimate, (2) the positive class-frequencies.

It follows from what we have just said that there is only one
condition of consistence for the ultimate frequencies, viz. that
none of them may be less than zero. Apart from this, any one frequency of
the set may vary anywhere between 0 and ∞ without becoming
inconsistent with the others.

For the positive class-frequencies, the conditions may be
expressed symbolically by expanding the ultimate in terms of
the positive frequencies, and writing each such expansion not
less than zero. We will consider the cases of one, two, and
three attributes in turn.

8. If only one attribute be noted, say A, the positive frequencies
are N and (A). The ultimate frequencies are (A) and (α), where

(α) = N − (A).

The conditions of consistence are therefore simply

(A) ≥ 0        N − (A) ≥ 0

or, more conveniently expressed,

(a) (A) ≥ 0        (b) (A) ≤ N                             (1)

These conditions are obvious: the number of A's cannot be less
than zero, nor exceed the whole number of observations.

9. If two attributes be noted there are four ultimate frequencies
(AB), (Aβ), (αB), (αβ). The following conditions are given by
expanding each in terms of the frequencies of positive classes:

(a) (AB) ≥ 0                  or (AB) would be negative
(b) (AB) ≥ (A) + (B) − N      or (αβ) would be negative
(c) (AB) ≤ (A)                or (Aβ) would be negative
(d) (AB) ≤ (B)                or (αB) would be negative     (2)

(a), (c), and (d) are obvious; (b) is perhaps a little less obvious,
and is occasionally forgotten. It is, however, of precisely the
same type as the other three. None of these conditions are
really of a new form, but may be derived at once from (1) (a) and
(1) (b) by specifying the universe as B or as β respectively. The
conditions (2) are therefore really covered by (1).

10. But a further point arises as regards such a system of
limits as is given by (2). The conditions (a) and (b) give lower or
minor limits to the value of (AB); (c) and (d) give upper or
major limits. If either major limit be less than either minor limit
the conditions are impossible, and it is necessary to see whether
(A) and (B) can take such values that this may be the case.

Expressing the condition that the major limits must be not less
than the minor, we have:

(A) ≥ 0        (B) ≥ 0        (A) ≤ N        (B) ≤ N

These are simply the conditions of the form (1). If, therefore,
(A) and (B) fulfil the conditions (1), the conditions (2) must be
possible. The conditions (1) and (2) therefore give all the
conditions of consistence for the case of two attributes, conditions of
an extremely simple and obvious kind.
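
A short sketch of conditions (2) as a pair of limits for (AB) may make the point concrete. The function name and the figures in the usage lines are hypothetical, chosen only for illustration.

```python
def ab_limits(n, a, b):
    """Minor and major limits of (AB) implied by conditions (2).

    n, a, b are N, (A) and (B). The limits are possible only when
    (A) and (B) themselves satisfy conditions (1)."""
    if not (0 <= a <= n and 0 <= b <= n):
        raise ValueError("(A) and (B) must lie between 0 and N")
    lower = max(0, a + b - n)   # conditions (2)(a) and (2)(b)
    upper = min(a, b)           # conditions (2)(c) and (2)(d)
    return lower, upper

# Hypothetical figures: 100 observations, 60 A's and 70 B's.
print(ab_limits(100, 60, 70))
```

With these figures condition (b) is the binding minor limit: at least 30 of the observations must be both A and B.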

11. Now consider the case of three attributes. There are
eight ultimate frequencies. Expanding the ultimate in terms of
the positive frequencies, and expressing the condition that each
expansion is not less than zero, we have:

                                                           or the frequency
                                                           given below will
                                                           be negative.
(a) (ABC) ≥ 0                                              (ABC)
(b)       ≥ (AB) + (AC) − (A)                              (Aβγ)
(c)       ≥ (AB) + (BC) − (B)                              (αBγ)
(d)       ≥ (AC) + (BC) − (C)                              (αβC)
(e)       ≤ (AB)                                           (ABγ)
(f)       ≤ (AC)                                           (AβC)
(g)       ≤ (BC)                                           (αBC)
(h)       ≤ (AB) + (AC) + (BC) − (A) − (B) − (C) + N       (αβγ)     (3)

These, again, are not conditions of a new form. We leave it
as an exercise for the student to show that they may be derived
from (1) (a) and (1) (b) by specifying the universe in turn as
BC, Bγ, βC, and βγ. The two conditions holding in four universes
give the eight inequalities above.

12. As in the last case, however, these conditions will be
impossible to fulfil if any one of the major limits (e)-(h) be less than
any one of the minor limits (a)-(d). The values on the right
must be such as to make no major limit less than a minor.

There are four major and four minor limits, or sixteen comparisons
in all to be made. But twelve of these, the student will
find, only lead back to conditions of the form (2) for (AB), (AC),
and (BC) respectively. The four comparisons of expansions due
to contrary frequencies ((a) and (h), (b) and (g), (c) and (f), (d)
and (e)) alone lead to new conditions, viz.:

(a)   (AB) + (AC) + (BC) ≥ (A) + (B) + (C) − N
(b)   (AB) + (AC) − (BC) ≤ (A)
(c)   (AB) − (AC) + (BC) ≤ (B)
(d) − (AB) + (AC) + (BC) ≤ (C)                             (4)
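
The four conditions (4) lend themselves to a simple checking routine. The sketch below is illustrative only (the function name and return format are assumptions). It is applied to the figures of § 5, whose second-order frequencies turn out to fail the first condition, quite apart from the value given for (ABC).

```python
def failed_conditions_4(n, a, b, c, ab, ac, bc):
    """Return the labels of those conditions (4) which the data violate."""
    checks = {
        "a": ab + ac + bc >= a + b + c - n,
        "b": ab + ac - bc <= a,
        "c": ab - ac + bc <= b,
        "d": -ab + ac + bc <= c,
    }
    return [label for label, satisfied in checks.items() if not satisfied]

# Figures of § 5: N = 1000, (A) = 525, (B) = 312, (C) = 470,
# (AB) = 42, (AC) = 147, (BC) = 86.
print(failed_conditions_4(1000, 525, 312, 470, 42, 147, 86))
```

An empty list means that the second-order frequencies are consistent with each other so far as conditions (4) go. Conditions (1) and (2) still have to be checked separately.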

13. These are conditions of a wholly new type, not derivable
in any way from those given under (1) and (2). They are
conditions for the consistence of the second-order frequencies with
each other, whilst the inequalities of the form (2) are only conditions
for the consistence of the second-order frequencies with those of
lower orders. Given any two of the second-order frequencies, e.g.
(AB) and (AC), the conditions (4) give limits for the third, viz.
(BC). They thus replace, for statistical purposes, the ordinary
rules of syllogistic inference. From data of the syllogistic form,
they would, of course, lead to the same conclusion, though in a
somewhat cumbrous fashion; one or two cases are suggested as
exercises for the student (Questions 6 and 7). The following
will serve as illustrations of the statistical uses of the
conditions:

Example i. Given that (A) = (B) = (C) = ½N and 80 per cent.
of the A's are B, 75 per cent. of A's are C, find the limits to the
percentage of B's that are C. The data are:

2(AB)/N = 0.8        2(AC)/N = 0.75

and the conditions give:

(a) 2(BC)/N ≥ 1 − 0.8 − 0.75
(b)         ≥ 0.8 + 0.75 − 1
(c)         ≤ 1 − 0.8 + 0.75
(d)         ≤ 1 + 0.8 − 0.75

(a) gives a negative limit and (d) a limit greater than unity;
hence they may be disregarded. From (b) and (c) we have:

2(BC)/N ≥ 0.55        2(BC)/N ≤ 0.95

that is to say, not less than 55 per cent. nor more than 95 per
cent. of the B's can be C.
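
The limits of Example i. can be reproduced by solving conditions (4) for (BC). In the sketch below the function name is invented, and N is taken as 200 purely to keep the arithmetic in whole numbers; that value is an assumption, not a figure from the example.

```python
def bc_limits_from_4(n, a, b, c, ab, ac):
    """Limits of (BC) implied by conditions (4), the other frequencies being given."""
    lower = max(a + b + c - n - ab - ac,   # from (4)(a)
                ab + ac - a)               # from (4)(b)
    upper = min(b - ab + ac,               # from (4)(c)
                c + ab - ac)               # from (4)(d)
    return lower, upper

# Example i. with N assumed to be 200, so that (A) = (B) = (C) = 100,
# (AB) = 80 and (AC) = 75.
lower, upper = bc_limits_from_4(200, 100, 100, 100, 80, 75)
print(lower, upper)   # as (B) = 100, these are also percentages of the B's
```

The limits of the form (2) for (BC), namely 0 and the smaller of (B) and (C), apply in addition. In this example they are less stringent than those found from (4).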

Example  ii. — If  a  report  give  the  following  frequencies  as 
actually  observed,  show  that  there  must  be  a  misprint  or  mistake 
of  some  sort,  and  that  possibly  the  misprint  consists  in  the 
dropping  of  a  1  before  the  85  given  as  the  frequency  (BC). 

N     1000
(A)    510      (AB)    189
(B)    490      (AC)    140
(C)    427      (BC)     85

From (4) (a) we have:

(BC) ≥ 510 + 490 + 427 − 1000 − 189 − 140
     ≥ 98.

But 85 < 98, therefore it cannot be the correct value of (BC).
If we read 185 for 85 all the conditions are fulfilled.

Example iii. In a certain set of 1000 observations (A) = 45,
(B) = 23, (C) = 14. Show that whatever the percentages of B's
that are A and of C's that are A, it cannot be inferred that any B's
are C.

The conditions (a) and (b) give the lower limit of (BC), which
is required. We find:

(a) (BC)/N ≥ − (AB)/N − (AC)/N − .918

(b) (BC)/N ≥ (AB)/N + (AC)/N − .045

The first limit is clearly negative. The second must also be
negative, since (AB)/N cannot exceed .023 nor (AC)/N .014.
Hence we cannot conclude that there is any limit to (BC) greater
than 0. This result is indeed immediately obvious when we
consider that, even if all the B's were A, and of the remaining
22 A's 14 were C's, there would still be 8 A's that were neither
B nor C.
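
The conclusion of Example iii. can also be confirmed by brute force. The sketch below, an illustration only with an invented function name, runs through every value of (AB) and (AC) permitted by conditions (2) and records the largest lower limit that (4) (a) and (4) (b) can give for (BC).

```python
def lower_limit_bc(n, a, b, c, ab, ac):
    """Lower limit of (BC) from conditions (4)(a) and (4)(b)."""
    return max(a + b + c - n - ab - ac, ab + ac - a)

n, a, b, c = 1000, 45, 23, 14

# By conditions (2), (AB) lies between 0 and 23 and (AC) between 0 and 14,
# the minor limits (A) + (B) - N and (A) + (C) - N being negative here.
largest = max(
    lower_limit_bc(n, a, b, c, ab, ac)
    for ab in range(0, min(a, b) + 1)
    for ac in range(0, min(a, c) + 1)
)
print(largest)
```

A result below zero means that no admissible values of (AB) and (AC) compel any B's to be C.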

14. The student should note the result of the last example, as it
illustrates the sort of result at which one may often arrive by
applying the conditions (4) to practical statistics. For given
values of N, (A), (B), (C), (AB), and (AC), it will often happen
that any value of (BC) not less than zero (or, more generally, not
less than either of the lower limits (2) (a) and (2) (b)) will satisfy
the conditions (4), and hence no true inference of a lower limit is
possible. The argument of the type "So many A's are B and
so many A's are C that we must expect some B's to be C" must
be used with caution.
Definite Syllogism").

(2) Boole, G., Laws of Thought, 1854 (chapter xix., "Of Statistical
Conditions").

The above are the classical works with respect to the general theory
of numerical consistence. The student will find both difficult to follow

(3) Yule, G. U., "On the Theory of Consistence of Logical Class-frequencies
and its Geometrical Representation," Phil. Trans., A, vol. cxcvii.
(1901), p. 91. (Deals at length with the theory of consistence for
any number of attributes, using the notation of the present chapters.)

EXERCISES.


1. (For this and similar estimates cf. "Report by Miss Collet on the
Statistics of Employment of Women and Girls" [C. 7564] 1894.) If, in the
urban district of Bury, 817 per thousand of the women between 20 and 25
years of age were returned as "occupied" at the census of 1891, and 263 per
thousand as married or widowed, what is the lowest proportion per thousand
of the married or widowed that must have been occupied?

2. If, in a series of houses actually invaded by small-pox, 70 per cent. of the
inhabitants are attacked and 85 per cent. have been vaccinated, what is the
lowest percentage of the vaccinated that must have been attacked?

3. Given that 50 per cent. of the inmates of a workhouse are men, 60 per
cent. are "aged" (over 60), 80 per cent. non-able-bodied, 35 per cent. aged
men, 45 per cent. non-able-bodied men, and 42 per cent. non-able-bodied and
aged, find the greatest and least possible proportions of non-able-bodied aged
men.

4. (Material from ref. 5 of Chap. I.) The following are the proportions
per 10,000 of boys observed, with certain classes of defects amongst a number
of school-children. A = development defects, B = nerve signs, D = mental
dulness.

N    10,000          (D)   789
(A)     877          (AB)  338
(B)   1,086          (BD)  455

Show that some dull boys do not exhibit development defects, and state how
many at least do not do so.

5. The following are the corresponding figures for girls:

N    10,000          (D)   680
(A)     682          (AB)  248
(B)     850          (BD)  363

Show that some defectively developed girls are not dull, and state how many
at least must be so.

6. Take the syllogism "All A's are B, all B's are C, therefore all A's are
C," express the premisses in terms of the notation of the preceding chapters,
and deduce the conclusion by the use of the general conditions of consistence.

7. Do the same for the syllogism "All A's are B, no B's are C, therefore
no A's are C."

8. Given that (A) = (B) = (C) = ½N, and that (AB)/N = (AC)/N = p, find
what must be the greatest or least values of p in order that we may infer
that (BC)/N exceeds any given value, say q.

9. Show that if

and

\[ \frac{(AB)}{N} = \frac{(AC)}{N} = \frac{(BC)}{N} = y, \]

the value of neither x nor y can exceed 1/4.


CHAPTER  III. 


ASSOCIATION. 

1-4. The criterion of independence. 5-10. The conception of association and
testing for the same by the comparison of percentages. 11-12.
Numerical equality of the differences between the four second-order
frequencies and their independence values. 13. Coefficients of
association. 14. Necessity for an investigation into the causation of an
attribute A being extended to include non-A's.

1. If there is no sort of relationship, of any kind, between two
attributes A and B, we expect to find the same proportion of A's
amongst the B's as amongst the non-B's. We may anticipate,
for instance, the same proportion of abnormally wet seasons in
leap years as in ordinary years, the same proportion of male to
total births when the moon is waxing as when it is waning, the
same proportion of heads whether a coin be tossed with the right
hand or the left.

Two  such  unrelated  attributes  may  be  termed  independent,  and 
we  have  accordingly  as  the  criterion  of  independence  for  A  and  B — 

\[ \frac{(AB)}{(B)} = \frac{(A\beta)}{(\beta)} \qquad (1) \]

If this relation holds good, the corresponding relations

\[ \frac{(\alpha B)}{(B)} = \frac{(\alpha\beta)}{(\beta)}, \qquad \frac{(AB)}{(A)} = \frac{(\alpha B)}{(\alpha)}, \qquad \frac{(A\beta)}{(A)} = \frac{(\alpha\beta)}{(\alpha)} \]

must also hold. For it follows at once from (1) that

\[ \frac{(B) - (AB)}{(B)} = \frac{(\beta) - (A\beta)}{(\beta)}, \]

that is,

\[ \frac{(\alpha B)}{(B)} = \frac{(\alpha\beta)}{(\beta)}, \]

and the other two identities may be similarly deduced.

The student may find it easier to grasp the nature of the
relations stated if the frequencies are supposed grouped into a table
with two rows and two columns, thus:

Attribute.      Attribute B.   Attribute β.    Total.
A                  (AB)           (Aβ)          (A)
α                  (αB)           (αβ)          (α)
Total              (B)            (β)            N

Equation (1) states a certain equality for the columns; if this
holds good, the corresponding equation

\[ \frac{(AB)}{(A)} = \frac{(\alpha B)}{(\alpha)} \]

must hold for the rows, and so on.

2. The criterion may, however, be put into a somewhat
different and theoretically more convenient form. The equation
(1) expresses (AB) in terms of (B), (β), and a second-order
frequency (Aβ); eliminating this second-order frequency we have

\[ \frac{(AB)}{(B)} = \frac{(AB) + (A\beta)}{(B) + (\beta)} = \frac{(A)}{N}, \]

i.e. in words, "the proportion of A's amongst the B's is the same
as in the universe at large." The student should learn to
recognise this equation at sight in any of the forms

\[
\begin{aligned}
\frac{(AB)}{(B)} &= \frac{(A)}{N} &&\text{(a)}\\
\frac{(AB)}{(A)} &= \frac{(B)}{N} &&\text{(b)}\\
(AB) &= \frac{(A)(B)}{N} &&\text{(c)}\\
\frac{(AB)}{N} &= \frac{(A)}{N}\cdot\frac{(B)}{N} &&\text{(d)}
\end{aligned}
\qquad (2)
\]

The equation (d) gives the important fundamental rule: If the
attributes A and B are independent, the proportion of AB's in the universe
is equal to the proportion of A's multiplied by the proportion of B's.

The advantage of the forms (2) over the form (1) is that they
give expressions for the second-order frequency in terms of the
frequencies of the first order and the whole number of
observations alone; the form (1) does not.

Example i. If there are 144 A's and 384 B's in 1024
observations, how many AB's will there be, A and B being independent?

\[ \frac{144 \times 384}{1024} = 54. \]

There will therefore be 54 AB's.

Example ii. If the A's are 60 per cent., the B's 35 per cent., of
the whole number of observations, what must be the percentage
of AB's in order that we may conclude that A and B are
independent?

\[ \frac{60 \times 35}{100} = 21, \]

and therefore there must be 21 per cent. (more or less closely, cf.
§§ 7, 8 below) of AB's in the universe to justify the conclusion
that A and B are independent.
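
The arithmetic of Examples i. and ii. is simply form (c) of the
criterion, (AB) = (A)(B)/N. A minimal sketch follows; the function
name independence_value is ours, and exact fractions are used only to
avoid rounding.

```python
from fractions import Fraction


def independence_value(a, b, n):
    """Value that (AB) must take if A and B are independent: (A)(B)/N."""
    return Fraction(a * b, n)


print(independence_value(144, 384, 1024))  # Example i.: 54
print(independence_value(60, 35, 100))     # Example ii.: 21 (per cent.)
```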

3. It follows from § 1 that if the relation (2) holds for any one
of the four second-order frequencies, e.g. (AB), similar relations
must hold for the remaining three. Thus we have directly
from (1)

\[ \frac{(A\beta)}{(\beta)} = \frac{(AB) + (A\beta)}{(B) + (\beta)} = \frac{(A)}{N}, \]

giving

\[ (A\beta) = \frac{(A)(\beta)}{N}, \]

and so on. This is seen at once to be true on consideration
of the fourfold table of § 1. For if (AB) takes the value
(A)(B)/N, (Aβ) must take the value (A)(β)/N to keep the total
of the row equal to (A), and so on for the other rows and columns.
The fourfold table in the case of independence must in fact have
the form

Attribute.      Attribute B.   Attribute β.    Total.
A                (A)(B)/N       (A)(β)/N        (A)
α                (α)(B)/N       (α)(β)/N        (α)
Total              (B)            (β)            N
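
As an illustration, the whole independence table can be filled in from
N, (A) and (B) alone. The sketch below uses the figures of Example i.;
the function name and the dictionary keys ("Ab" for (Aβ), "aB" for
(αB), "ab" for (αβ)) are our own labels, not the book's notation.

```python
from fractions import Fraction


def independence_table(a, b, n):
    """Four second-order frequencies in the case of independence."""
    alpha, beta = n - a, n - b          # (α) = N - (A), (β) = N - (B)
    return {
        "AB": Fraction(a * b, n),
        "Ab": Fraction(a * beta, n),
        "aB": Fraction(alpha * b, n),
        "ab": Fraction(alpha * beta, n),
    }


table = independence_table(144, 384, 1024)
for name, value in table.items():
    print(name, value)
# The row and column totals are reproduced exactly:
print(table["AB"] + table["Ab"], table["AB"] + table["aB"])
```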
Example iii. In Example i. above, what would be the number
of αβ's, A and B being independent?

\[ (\alpha) = 1024 - 144 = 880 \]
\[ (\beta) = 1024 - 384 = 640 \]
\[ (\alpha\beta) = \frac{880 \times 640}{1024} = 550. \]

4. Finally, the criterion of independence may be expressed in
yet a third form, viz. in terms of the second-order frequencies
alone. If A and B are independent, it follows at once from the
preceding section that

\[ (AB)(\alpha\beta) = \frac{(A)(B)(\alpha)(\beta)}{N^2}. \]

And evidently (αB)(Aβ) is equal to the same fraction.
Therefore

\[
\begin{aligned}
(AB)(\alpha\beta) &= (\alpha B)(A\beta) &&\text{(a)}\\
\frac{(AB)}{(\alpha B)} &= \frac{(A\beta)}{(\alpha\beta)} &&\text{(b)}\\
\frac{(AB)}{(A\beta)} &= \frac{(\alpha B)}{(\alpha\beta)} &&\text{(c)}
\end{aligned}
\qquad (3)
\]

The equation (b) may be read "The ratio of A's to α's amongst
the B's is equal to the ratio of A's to α's amongst the β's," and
(c) similarly.

This form of criterion is a convenient one if all the four
second-order frequencies are given, enabling one to recognise almost at a
glance whether or not the two attributes are independent.

Example iv. If the second-order frequencies have the following
values, are A and B independent or not?

(AB) = 110       (αB) = 90       (Aβ) = 290       (αβ) = 510.

Clearly (AB)(αβ) > (αB)(Aβ),

so A and B are not independent.
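
The comparison of cross-products in Example iv. may be written out as
follows. This is a sketch only; the function and argument names are
ours.

```python
def cross_product_difference(ab, a_beta, alpha_b, alpha_beta):
    """(AB)(αβ) - (αB)(Aβ); zero if and only if form (3) (a) holds."""
    return ab * alpha_beta - alpha_b * a_beta


diff = cross_product_difference(ab=110, a_beta=290, alpha_b=90, alpha_beta=510)
print(diff)        # 56100 - 26100 = 30000
print(diff == 0)   # False, so A and B are not independent
```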

5. Suppose now that A and B are not independent, but related
in some way or other, however complicated.

Then if

\[ (AB) > \frac{(A)(B)}{N}, \]

A and B are said to be positively associated, or sometimes simply
associated. If, on the other hand,

\[ (AB) < \frac{(A)(B)}{N}, \]

A and B are said to be negatively associated or, more briefly,
disassociated.

The student should notice that these words are not used
exactly in their ordinary senses, but in a technical sense. When
A and B are said to be associated, it is not meant merely that
some A's are B's, but that the number of A's which are B's exceeds
the number to be expected if A and B are independent. Similarly,
when A and B are said to be negatively associated or disassociated,
it is not meant that no A's are B's, but that the number of A's
which are B's falls short of the number to be expected if A and B
are independent. "Association" cannot be inferred from the mere
fact that some A's are B's, however great that proportion; this
principle is fundamental, and should be always borne in mind.

6. The greatest possible value of (AB) for given values of
N, (A), and (B) is either (A) or (B) (whichever is the less). When
(AB) attains either of these values, A and B may be said to be
completely or perfectly associated. The lowest possible value of
(AB), on the other hand, is either zero or (A) + (B) - N
(whichever is the greater). When (AB) falls to either of these values,
A and B may be said to be completely disassociated. Complete
association is generally understood to correspond to one or other
of the cases, "All A's are B" or "All B's are A," or it may be
more narrowly defined as corresponding only to the case when
both these statements are true. Complete disassociation may
be similarly taken as corresponding to one or other of the cases,
"No A's are B," or "no α's are β," or more narrowly to the
case when both these statements are true. The greater the
divergence of (AB) from the value (A)(B)/N towards the
limiting value in either direction, the greater, we may say, is the
intensity of association or of disassociation, so that we may speak
of attributes being more or less, highly or slightly associated. This
conception of degrees of association, degrees which may in fact be
measured by certain formulae (cf. § 13), is important.
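
The two limiting values of (AB) can be stated compactly. In the
sketch below the function name is ours; the first call uses the figures
of Example i., while the second set of figures is invented purely to
show a case in which the lower limit is not zero.

```python
def ab_limits(a, b, n):
    """Lowest and greatest possible values of (AB) for given (A), (B), N."""
    lowest = max(0, a + b - n)   # complete disassociation
    greatest = min(a, b)         # complete association
    return lowest, greatest


print(ab_limits(144, 384, 1024))   # (0, 144)
print(ab_limits(700, 600, 1000))   # (300, 600), hypothetical figures
```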

7. When the association is very slight, i.e. where (AB) only
differs from (A)(B)/N by a few units or by a small proportion, it
may be that such association is not really significant of any
definite relationship. To give an illustration, suppose that a coin
is tossed a number of times, and the tosses noted in pairs; then
100 pairs may give such results as the following (taken from an
actual record):

First toss heads and second heads      26
First toss heads and second tails      18
First toss tails and second heads      27
First toss tails and second tails      29

If we use A to denote "heads" in the first toss, B "heads" in
the second, we have from the above (A) = 44, (B) = 53. Hence

\[ \frac{(A)(B)}{N} = \frac{44 \times 53}{100} = 23.32, \]

while actually (AB) is 26. Hence
there is a positive association, in the given record, between
the result of the first throw and the result of the second. But it
is fairly certain, from the nature of the case, that such association
cannot indicate any real connection between the results of the
two throws; it must therefore be due merely to such a complex
system of causes, impossible to analyse, as leads, for example, to
differences between small samples drawn from the same material.
The conclusion is confirmed by the fact that, of a number of such
records, some give a positive association (like the above), but
others a negative association.
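
The figures of the coin-tossing record can be checked as follows. The
sketch is illustrative only; the dictionary keys ("HT" meaning first
toss heads, second tails, and so on) are our own labels.

```python
from fractions import Fraction

pairs = {"HH": 26, "HT": 18, "TH": 27, "TT": 29}

n = sum(pairs.values())              # 100 pairs of tosses
a = pairs["HH"] + pairs["HT"]        # (A): heads on the first toss
b = pairs["HH"] + pairs["TH"]        # (B): heads on the second toss
expected = Fraction(a * b, n)        # independence value (A)(B)/N
excess = pairs["HH"] - expected      # actual (AB) less independence value

print(a, b)                # 44 53
print(float(expected))     # 23.32
print(float(excess))       # 2.68, a slight positive association
```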

8.  An  event  due,  like  the  above  occurrence  of  positive  associa- 
tion, to  an  extremely  complex  system  of  causes  of  the  general 
nature  of  which  we  are  aware,  but  of  the  detailed  operation  of 
which  we  are  ignorant,  is  sometimes  said  to  be  due  to  chance,  or 
better  to  the  chances  or  fluctuations  of  sampling. 

A  little  consideration  will  suggest  that  such  associations  due  to 
the  fluctuations  of  sampling  must  be  met  with  in  all  classes  of 
statistics.  To  quote,  for  instance,  from  §  1,  the  two  illustrations 
there  given  of  independent  attributes,  we  know  that  in  any 
actual  record  we  would  not  be  likely  to  find  exactly  the  same 
proportion  of  abnormally  wet  seasons  in  leap  years  as  in  ordinary 
years,  nor  exactly  the  same  proportion  of  male  births  when  the 
moon  is  waxing  as  when  it  is  waning.  But  so  long  as  the  diver- 
gence from  independence  is  not  well  marked  we  must  regard  such 
attributes  as  practically  independent,  or  dependence  as  at  least 
unproved. 

The  discussion  of  the  question,  how  great  the  divergence  must 
be  before  we  can  consider  it  as  "  well  marked,"  must  be  postponed 
to  the  chapters  dealing  with  the  theory  of  sampling.  At  present 
the  attention  of  the  student  can  only  be  directed  to  the  existence 
of  the  difficulty,  and  to  the  serious  risk  of  interpreting  a  "chance 
association  "  as  physically  significant. 

9. The definition of § 5 suggests that we are to test the
existence or the intensity of association between two attributes
by a comparison of the actual value of (AB) with its
independence-value (as it may be termed) (A)(B)/N. The procedure is from the
theoretical standpoint perhaps the most natural, but it is more
usual, and is simplest and best in practice, to compare proportions,
e.g. the proportion of A's amongst the B's with the proportion
amongst the β's. Such proportions are usually expressed in the
form of percentages or proportions per thousand.

It will be evident from §§ 1 and 2 that a large number of such
comparisons are available for the purpose, and the question arises,
therefore, which is the best comparison to adopt?

10. Two principles should decide this point: (1) of any two
comparisons, that is the better which brings out the more clearly
the degree of association; (2) of any two comparisons, that is the
better which illustrates the more important aspect of the problem
under discussion.

The first condition at once suggests that comparisons of the
form

\[ \frac{(AB)}{(B)} > \frac{(A\beta)}{(\beta)} \qquad \text{(a)} \]

are better than comparisons of the form

\[ \frac{(AB)}{(B)} > \frac{(A)}{N} \qquad \text{(b)} \]

For it is evident that if most of the objects or individuals in the
universe are B's, i.e. if (B)/N approaches unity, (AB)/(B) will
necessarily approach (A)/N even though the difference between
(AB)/(B) and (Aβ)/(β) is considerable. The second form of
comparison may therefore be misleading.

Setting aside, then, comparisons of the general form (b), the
question remains whether to apply the comparison of the form (a)
to the rows or the columns of the table, if the data are tabulated
as in § 1. This question must be decided with reference to the
second principle, i.e. with regard to the more important aspect of
the problem under discussion, the exact question to be answered,
or the hypothesis to be tested, as illustrated by the examples
below. Where no definite question has to be answered or
hypothesis tested both pairs of proportions may be tabulated,
as in Example vi.

Example v. Association between inoculation against cholera
and exemption from attack. (Data from Greenwood and Yule,
Table III., ref. 6.)

                   Not attacked.   Attacked.   Total.
Inoculated              276            3         279
Not inoculated          473           66         539
Total                   749           69         818

Here the important question is, How far does inoculation
protect from attack? The most natural comparison is therefore

Percentage of inoculated who were not attacked          98.9
Percentage of not inoculated who were not attacked      87.8

or we might tabulate the complementary proportions

Percentage of inoculated who were attacked               1.1
Percentage of not inoculated who were attacked          12.2

Either comparison brings out simply and clearly the fact that
inoculation and exemption from attack are positively associated
(inoculation and attack negatively associated).

We are making above a comparison by rows in the notation of
the table of § 1, comparing (AB)/(A) with (αB)/(α), or (Aβ)/(A)
with (αβ)/(α). A comparison by columns, e.g. (AB)/(B) with
(Aβ)/(β), would serve equally to indicate whether there was any
appreciable association, but would not answer directly the
particular question we have in mind:

Percentage of not-attacked who were inoculated          36.8
Percentage of attacked who were inoculated               4.3
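
The row and column percentages of Example v. may be recomputed from
the fourfold table as a check. The sketch below is ours and is not part
of the original; the variable names are hypothetical labels for the
four cells.

```python
# Cells of the table of Example v.
inoc_not_attacked, inoc_attacked = 276, 3
notinoc_not_attacked, notinoc_attacked = 473, 66


def percentage(part, whole):
    """Percentage to one decimal place."""
    return round(100 * part / whole, 1)


# Comparison by rows: proportion not attacked in each class.
by_rows = (
    percentage(inoc_not_attacked, inoc_not_attacked + inoc_attacked),
    percentage(notinoc_not_attacked, notinoc_not_attacked + notinoc_attacked),
)
# Comparison by columns: proportion inoculated in each class.
by_columns = (
    percentage(inoc_not_attacked, inoc_not_attacked + notinoc_not_attacked),
    percentage(inoc_attacked, inoc_attacked + notinoc_attacked),
)
print(by_rows)      # (98.9, 87.8)
print(by_columns)   # (36.8, 4.3)
```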

Example vi. Deaf-mutism and Imbecility. (Material from
Census of 1901. Summary Tables. [Cd. 1523.])

Total population of England and Wales            32,528,000
Number of the imbecile (or feeble-minded)            48,882
Number of deaf-mutes                                 15,246
Number of imbecile deaf-mutes                           451

Required, to find whether deaf-mutism is associated with
imbecility.

We may denote the number of the imbecile by (A), of
deaf-mutes by (B). A comparison of (AB)/(B) with (A)/N or of
(AB)/(A) with (B)/N may very well be used in this case, seeing
that (A)/N and (B)/N are both small. The question whether to
give the preference to the first or the second comparison depends
on the nature of the investigation we wish to make. If it is
desired to exhibit the conditions among deaf-mutes the first may
be used:

Proportion of imbeciles among deaf-mutes
  = (AB)/(B)                                  29.6 per thousand
Proportion of imbeciles in the whole
  population = (A)/N                           1.5 per thousand

If, on the other hand, it is desired to exhibit the conditions
amongst the imbecile, the second will be preferable.

Proportion of deaf-mutes amongst the
  imbecile = (AB)/(A)                          9.2 per thousand
Proportion of deaf-mutes in the whole
  population = (B)/N                           0.5 per thousand

Either comparison exhibits very clearly the high degree of
association between the attributes. It may be pointed out, however,
that census data as to such infirmities are very untrustworthy.

Example vii. Eye-colour of father and son (material due
to Sir Francis Galton, as given by Professor Karl Pearson, Phil.
Trans., A, vol. cxcv. (1900), p. 138; the classes 1, 2, and 3 of the
memoir treated as light).

Fathers with light eyes and sons with light eyes (AB)             471
Fathers with light eyes and sons with not light eyes (Aβ)         151
Fathers with not light eyes and sons with light eyes (αB)         148
Fathers with not light eyes and sons with not light eyes (αβ)     230

Required to find whether the colour of the son's eyes is
associated with that of the father's. In cases of this kind the
father is reckoned once for each son; e.g. a family in which the
father was light-eyed, two sons light-eyed and one not, would be
reckoned as giving two to the class AB and one to the class Aβ.

The best comparison here is

Percentage of light-eyed amongst the sons
  of light-eyed fathers                          76 per cent.
Percentage of light-eyed amongst the sons
  of not-light-eyed fathers                      39 per cent.

But the following is equally valid

Percentage of light-eyed amongst the
  fathers of light-eyed sons                     76 per cent.
Percentage of light-eyed amongst the
  fathers of not-light-eyed sons                 40 per cent.

The reason why the former comparison is preferable is, that we
usually wish to estimate the character of offspring from that of
the parents, and define heredity in terms of the resemblance of
offspring to parents. We do not, as a rule, want to make use of
the power of estimating the character of parents from that of their
offspring, nor do we define heredity in terms of the resemblance
of parents to offspring. Both modes of statement, however,
indicate equally clearly the tendency to resemblance between
father and son.

Example viii. Association between inoculation against cholera
and exemption from attack, five separate epidemics (cf. Example
v., data from Tables IX., X., XXVIII., XXIX., XXXI. of
reference 6).

                   Not attacked.   Attacked.    Total.
Inoculated               192            4         196
Not inoculated           113           34         147
Total                    305           38         343

                   Not attacked.   Attacked.    Total.
Inoculated             5,751           27       5,778
Not inoculated         6,351          198       6,549
Total                 12,102          225      12,327

                   Not attacked.   Attacked.    Total.
Inoculated             4,087            5       4,092
Not inoculated       113,856        1,144     115,000
Total                117,943        1,149     119,092

                   Not attacked.   Attacked.    Total.
Inoculated             8,332            8       8,340
Not inoculated        84,444          556      85,000
Total                 92,776          564      93,340

                   Not attacked.   Attacked.    Total.
Inoculated             4,870            5       4,875
Not inoculated       153,096          904     154,000
Total                157,966          909     158,875

With the table of Example v. the above give data for six
separate epidemics, in all of which the same method of
inoculation appears to have been used: the data refer to natives only,
and the numbers of observations are sufficiently large to reduce
"fluctuations of sampling" within reasonably narrow limits.
The proportions not attacked are as follows:

              Proportion not attacked.
        Not inoculated.   Inoculated.   Difference.
1           0.8776          0.9892        0.1116
2           0.7687          0.9796        0.2109
3           0.9698          0.9953        0.0255
4           0.9901          0.9988        0.0087
5           0.9935          0.9990        0.0055
6           0.9941          0.9990        0.0049

In each case inoculation and exemption from attack are positively
associated, but it will be seen that the several proportions, and
the differences between them, vary considerably. Evidently in
a very mild epidemic this difference can only be small, and the
question arises how far the data for the separate epidemics can
be said to be consistent in their indication of the "efficiency"
of the inoculation. This is not a simple question to answer;
the more advanced student is referred to the discussion in the
original.
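
The table of proportions can be reproduced from the six fourfold
tables. The following is a sketch under our own conventions: each
tuple holds (inoculated not attacked, inoculated attacked, not
inoculated not attacked, not inoculated attacked), the first tuple being
Example v., and the difference is taken between the proportions after
rounding to four places, which appears to be how the printed
differences were formed.

```python
epidemics = [
    (276, 3, 473, 66),
    (192, 4, 113, 34),
    (5751, 27, 6351, 198),
    (4087, 5, 113856, 1144),
    (8332, 8, 84444, 556),
    (4870, 5, 153096, 904),
]


def proportions_not_attacked(inoc_ok, inoc_att, not_ok, not_att):
    """Return (not inoculated, inoculated, difference), four decimals."""
    p_not = round(not_ok / (not_ok + not_att), 4)
    p_inoc = round(inoc_ok / (inoc_ok + inoc_att), 4)
    return p_not, p_inoc, round(p_inoc - p_not, 4)


rows = [proportions_not_attacked(*e) for e in epidemics]
for number, row in enumerate(rows, start=1):
    print(number, *(f"{value:.4f}" for value in row))
```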

11. The values that the four second-order frequencies take in
the case of independence, viz.

\[ \frac{(A)(B)}{N}, \qquad \frac{(\alpha)(B)}{N}, \qquad \frac{(A)(\beta)}{N}, \qquad \frac{(\alpha)(\beta)}{N}, \]

are of such great theoretical importance, and of so much use
as reference-values for comparing with the actual values of
the frequencies (AB), (αB), (Aβ) and (αβ), that it is often
desirable to employ single symbols to denote them. We shall use
the symbols

\[ (AB)_0 = \frac{(A)(B)}{N}, \quad (\alpha B)_0 = \frac{(\alpha)(B)}{N}, \quad (A\beta)_0 = \frac{(A)(\beta)}{N}, \quad (\alpha\beta)_0 = \frac{(\alpha)(\beta)}{N}. \]

If δ denote the excess of (AB) over (AB)₀, then, in order to keep
the totals of rows and columns constant, the general table
(cf. the table for the case of independence in § 3) must
be of the form

Attribute.      Attribute B.   Attribute β.    Total.
A                (AB)₀ + δ      (Aβ)₀ - δ       (A)
α                (αB)₀ - δ      (αβ)₀ + δ       (α)
Total              (B)            (β)            N

Therefore, quite generally we have

\[ (AB) - (AB)_0 = (\alpha\beta) - (\alpha\beta)_0 = (A\beta)_0 - (A\beta) = (\alpha B)_0 - (\alpha B). \]

12. The value of this common difference δ may be expressed
in a form that is useful to note. We have by definition

\[ \delta = (AB) - (AB)_0 = (AB) - \frac{(A)(B)}{N}. \]

Bring the terms on the right to a common denominator, and
express all the frequencies of the numerator in terms of those of
the second order; then we have

\[
\begin{aligned}
\delta &= \frac{1}{N}\Bigl\{ (AB)\bigl[(AB) + (\alpha B) + (A\beta) + (\alpha\beta)\bigr] - \bigl[(AB) + (A\beta)\bigr]\bigl[(AB) + (\alpha B)\bigr] \Bigr\} \\
       &= \frac{1}{N}\bigl\{ (AB)(\alpha\beta) - (\alpha B)(A\beta) \bigr\}.
\end{aligned}
\]

That is to say, the common difference is equal to 1/Nth of the
difference of the "cross products" (AB)(αβ) and (αB)(Aβ).

It is evident that the difference of the cross-products may be
very large if N be large, although δ is really very small. In
using the difference of the cross-products to test mentally the
sign of the association in a case where all the four second-order
frequencies are given, this should be remembered: the difference
should be compared with N, or it will be liable to suggest a higher
degree of association than actually exists.

Example ix. The following data were observed for hybrids of
Datura (W. Bateson and Miss Saunders, Report to the Evolution
Committee of the Royal Society, 1902):

Flowers violet, fruits prickly (AB)      47
Flowers violet, fruits smooth (Aβ)       12
Flowers white, fruits prickly (αB)       21
Flowers white, fruits smooth (αβ)         3

Investigate the association between colour of flower and
character of fruit.

Since 3 x 47 = 141, 12 x 21 = 252, i.e. (AB)(αβ) < (αB)(Aβ),
there is clearly a negative association; 252 - 141 = 111, and at
first sight this considerable difference is apt to suggest a
considerable association. But δ = 111/83 = 1.3 only, so that in point of
fact the association is small, so small that no stress can be laid
on it as indicating anything but a fluctuation of sampling.
Working out the percentages we have

Percentage of violet-flowered plants with
  prickly fruits                                 80 per cent.
Percentage of white-flowered plants with
  prickly fruits                                 88 per cent.
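
The identity of § 12 may be verified numerically on the Datura data.
The sketch below is ours (the function name delta is hypothetical); it
computes the common difference from the cross-products and confirms
that it agrees with (AB) - (A)(B)/N. Note that the sign is negative,
the text quoting the magnitude only.

```python
from fractions import Fraction


def delta(ab, a_beta, alpha_b, alpha_beta):
    """Common difference: {(AB)(αβ) - (αB)(Aβ)} / N."""
    n = ab + a_beta + alpha_b + alpha_beta
    return Fraction(ab * alpha_beta - alpha_b * a_beta, n)


d = delta(ab=47, a_beta=12, alpha_b=21, alpha_beta=3)
a, b, n = 47 + 12, 47 + 21, 83            # (A), (B) and N
print(d)                                  # -111/83
print(round(float(d), 1))                 # -1.3
print(d == 47 - Fraction(a * b, n))       # True: same as (AB) - (A)(B)/N
```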


13. While the methods used in the preceding pages suffice for
nearly all practical purposes, it may be convenient to measure
the intensities of association in different cases by means of some
formula or "coefficient," so devised as to be zero when the
attributes are independent, +1 when they are completely associated,
and -1 when they are completely disassociated, in the sense of
§ 6. If we use the term "complete association" in the wider
sense there defined, we have, grouping the frequencies in fourfold
tables, the three cases of complete association:

        (1)                     (2)                     (3)
(AB)    0    | (A)      (AB)   (Aβ)  | (A)      (AB)    0    | (A)
(αB)   (αβ)  | (α)       0     (αβ)  | (α)       0     (αβ)  | (α)
(B)    (β)   |  N       (B)    (β)   |  N       (B)    (β)   |  N

In the first case all A's are B, and so (Aβ) = 0; in the second
all B's are A, and so (αB) = 0; and in the third case we have (A) =
(B) = (AB), so that all A's are B and also all B's are A. The
three corresponding cases of complete disassociation are

        (4)                     (5)                     (6)
 0     (Aβ)  | (A)      (AB)   (Aβ)  | (A)       0     (Aβ)  | (A)
(αB)   (αβ)  | (α)      (αB)    0    | (α)      (αB)    0    | (α)
(B)    (β)   |  N       (B)    (β)   |  N       (B)    (β)   |  N
It  is  required  to  devise  some  formula  which  shall  give  the  value 
+  1  in  the  first  three  cases,  -  1  in  the  second  three,  and  shall 
also  be  zero  where  the  attributes  are  independent.  Many  such 
formulae  may  be  devised,  but  perhaps  the  simplest  possible  (though 
not  necessarily  the  most  advantageous)  is  the  expression — 

^_(AB){ali)-{Ap){aB) 
{AB){ali)  +  {Ali){aBj 

^  m  

(AB){al3)  +  {AI3){aB) 

— where  8  is  the  symbol  used  in  the  two  last  sections  for  the 
difference  {AB)  -  {AB)^.  It  is  evident  that  Q  is  zero  when  the 
attributes  are  independent,  for  then  8  is  zero:  it  takes  the  value  +  1 
when  there  is  complete  association,  for  then  the  second  term  in 
both  numerator  and  denominator  of  the  first  form  of  the  expression 
is  zero :  similarly  it  is  -  1  where  there  is  complete  disassociation, 
for  then  the  first  term  in  both  numerator  and  denominator  is 
zero.  Q  may  accordingly  be  termed  a  coefficient  of  association. 
As illustrations of the values it will take in certain cases, the
association between deaf-mutism and imbecility, on the basis of the
English census figures (Example vi.) is +0.91; between light eye
colour in father and in son (Example vii.) +0.66; between colour of
flower and prickliness of fruit in Datura (Example ix.) -0.28, an
association which, however, as already stated, is probably of no
practical significance and due to mere fluctuations of sampling.
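
The behaviour of Q in the three defining situations can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `yule_q` and the three small tables are hypothetical, chosen only so that one has (Aβ) = 0, one has (AB) = 0, and one satisfies (AB)(αβ) = (Aβ)(αB).

```python
from fractions import Fraction

def yule_q(ab, a_beta, alpha_b, alpha_beta):
    """Coefficient of association Q from the four class-frequencies
    (AB), (A beta), (alpha B), (alpha beta) of a fourfold table."""
    main = ab * alpha_beta       # (AB)(alpha beta)
    off = a_beta * alpha_b       # (A beta)(alpha B)
    return Fraction(main - off, main + off)

# Hypothetical tables (not from the text), one for each defining case.
print(yule_q(30, 0, 20, 50))    # complete association: (A beta) = 0
print(yule_q(0, 30, 20, 50))    # complete disassociation: (AB) = 0
print(yule_q(20, 30, 40, 60))   # independence: 20*60 == 30*40
```

Exact fractions are used so that the limiting values +1, -1 and 0 are reproduced without rounding.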

The student should note that the value of Q for a given table
is unaltered by multiplying either a row or a column by any
arbitrary number, i.e. the value is independent of the relative
proportions of A's and α's included in the table. This property
is of importance, and renders such a measure of association
specially adapted to cases (e.g. experiments) in which the
proportions are arbitrary. A form possessing the same property but
certain marked advantages over Q is suggested in ref. (3).
The coefficient is only mentioned here to direct the attention
of the student to the possibility of forming such a measure of
association, a measure which serves a similar purpose in the case
of attributes to that served by certain other coefficients in the
cases of manifold classification (cf. Chap. V.) and of variables
(cf. Chap. IX., and the references to Chaps. X. and XVI.).  For
further  illustrations  of  the  use  of  this  coefficient  the  reader  is 
referred  to  the  reference  (1)  at  the  end  of  this  chapter;  for  the 
modified  form  of  the  coefficient,  possessing  the  same  properties 
but  certain  advantages,  to  ref.  (3) ;  and  for  a  mode  of  deducing 
another  coefficient,  based  on  theorems  in  the  theory  of  variables, 
which  has  come  into  more  general  use,  though  in  the  opinion  of 
the  present  writer  its  use  is  of  doubtful  advantage,  to  ref.  (4). 
Reference  should  also  be  made  to  the  coefficient  described  in  §  10 
of  Chap.  XI.  The  question  of  the  best  coefficient  to  use  as  a 
measure  of  association  is  still  the  subject  of  controversy :  for  a 
discussion  the  student  is  referred  to  refs.  (3),  (5),  and  (6). 
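
The invariance of Q under multiplication of a row or a column, noted earlier in this section, can be illustrated with a short sketch. The table and the multipliers 7 and 3 are hypothetical, and the helper names are introduced here only for illustration.

```python
from fractions import Fraction

def yule_q(ab, a_beta, alpha_b, alpha_beta):
    """Coefficient of association Q for a fourfold table."""
    return Fraction(ab * alpha_beta - a_beta * alpha_b,
                    ab * alpha_beta + a_beta * alpha_b)

def q_of(t):
    """Q for a table given as [[(AB), (A beta)], [(alpha B), (alpha beta)]]."""
    return yule_q(t[0][0], t[0][1], t[1][0], t[1][1])

def scale_row(t, i, k):
    """Multiply every frequency in row i by k."""
    return [[k * x for x in row] if r == i else list(row)
            for r, row in enumerate(t)]

def scale_col(t, j, k):
    """Multiply every frequency in column j by k."""
    return [[k * x if c == j else x for c, x in enumerate(row)] for row in t]

# Hypothetical table: rows are A / alpha, columns are B / beta.
table = [[40, 10],
         [20, 30]]

q0 = q_of(table)
q_row = q_of(scale_row(table, 0, 7))   # seven times as many A's sampled
q_col = q_of(scale_col(table, 1, 3))   # three times as many beta's sampled
print(q0, q_row, q_col)
```

The multiplier appears in both products of the numerator and of the denominator, so it cancels; this is why the three printed values agree.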

14. In concluding this chapter, it may be well to repeat, for the
sake of emphasis, that (cf. § 5) the mere fact of 80, 90, or 99 per
cent. of A's being B implies nothing as to the association of A
with B; in the absence of information, we can but assume that
80, 90, or 99 per cent. of α's may also be B. In order to apply
the criterion of independence for two attributes A and B, it is
necessary to have information concerning α's and β's as well as
A's and B's, or concerning a universe that includes both α's and
A's, β's and B's. Hence an investigation as to the causal
relations of an attribute A must not be confined to A's, but must
be extended to α's (unless, of course, the necessary information
as to α's is already obtainable): no comparison is otherwise
possible. It would be no use to obtain with great pains the
result (cf. Example vi.) that 29.6 per thousand of deaf-mutes
were imbecile unless we knew that the proportion of imbeciles
in the whole population was only 1.5 per thousand; nor would
it contribute anything to our knowledge of the heredity of
deaf-mutism to find out the proportion of deaf-mutes amongst the
offspring of deaf-mutes unless the proportions amongst the
offspring of normal individuals were also investigated or known.
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coefficient  of  §  13  given  which  possesses  marked  advantages.) 

(4) Pearson, Karl, "On the Correlation of Characters not Quantitatively
Measurable," Phil. Trans. Roy. Soc., Series A, vol. cxcv., 1900, p. 1.
(Deals with the problem of measurement of intensity of association
from the standpoint of the theory of variables, giving a method which
has since been largely used: only the advanced student will be able to
follow the work. For a criticism see ref. 3.)

(5) Pearson, Karl, and David Heron, "On Theories of Association,"
Biometrika, vol. ix., 1913, pp. 159-332. (A reply to criticisms in ref. 3.)

(6) Greenwood, M., and G. U. Yule, "The Statistics of Anti-typhoid and
Anti-cholera Inoculations, and the interpretation of such statistics in
general," Proc. Roy. Soc. of Medicine, vol. viii., 1915, p. 113. (Cited
for the discussion of association coefficients in § 4, and the conclusion
that none of these coefficients are of much value for comparative
purposes in interpreting statistics of the type considered.)

(7) Lipps, G. F., "Die Bestimmung der Abhängigkeit zwischen den Merkmalen
eines Gegenstandes," Berichte d. math.-phys. Klasse d. kgl. sächsischen
Gesellschaft d. Wissenschaften, Leipzig, Feb. 1905. (Deals with the
general theory of the dependence between two characters, however
classified; the coefficient of association of § 13 is again suggested
independently.)

EXERCISES. 

1.  At  the  census  of  England  and  Wales  in  1901  there  were  (to  the  nearest 
1000)  15,729,000  males  and  16,799,000  females;  3497  males  were  returned 
as  deaf-mutes  from  childhood,  and  3072  females. 

State  proportions  exhibiting  the  association  between  deaf-mutism  from 
childhood  and  sex.  How  many  of  each  sex  for  the  same  total  number  would 
have  been  deaf-mutes  if  there  had  been  no  association  ? 

2. Show, as briefly as possible, whether A and B are independent,
positively associated, or negatively associated in each of the following cases:

(a)  N    = 5000    (A)  = 2350    (B)  = 3100    (AB) = 1600
(b)  (A)  =  490    (AB) =  294    (α)  =  570    (αB) =  380
(c)  (AB) =  256    (αB) =  768    (Aβ) =   48    (αβ) =  144

3. (Figures derived from Darwin's Cross- and Self-fertilisation of Plants,
cf. ref. 1, p. 294.) The table below gives the numbers of plants of certain
species that were above or below the average height, stating separately those
that were derived from cross-fertilised and from self-fertilised parentage.
Investigate the association between height and cross-fertilisation of parentage,
and draw attention to any special points you notice.

                       Parentage Cross-fertilised.    Parentage Self-fertilised.
                                Height                         Height
Species.                 Above        Below             Above        Below
                         Average.     Average.          Average.     Average.

Ipomoea purpurea           63           10                18           55
Petunia violacea           61           16                13           64
Reseda lutea               25            7                11           21
Reseda odorata             39           16                25           30
Lobelia fulgens            17           17                12           22
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4.  (Figures  from  same  source  as  Example  vii.  p.  33,  but  material  differently 
grouped; classes 7 and 8 of the memoir treated as "dark.") Investigate the
association between darkness of eye-colour in father and son from the following
data:

Fathers with dark eyes and sons with dark eyes            (AB) .  50
Fathers with dark eyes and sons with not-dark eyes        (Aβ) .  79
Fathers with not-dark eyes and sons with dark eyes        (αB) .  89
Fathers with not-dark eyes and sons with not-dark eyes    (αβ) . 782

Also tabulate for comparison the frequencies that would have been observed
had there been no heredity, i.e. the values of (AB)₀, (Aβ)₀, etc. (§ 11).

5. (Figures from same source as above.) Investigate the association between
eye colour of husband and eye colour of wife ("assortative mating") from
the data given below.

Husbands with light eyes and wives with light eyes            (AB) . 309
Husbands with light eyes and wives with not-light eyes        (Aβ) . 214
Husbands with not-light eyes and wives with light eyes        (αB) . 132
Husbands with not-light eyes and wives with not-light eyes    (αβ) . 119

Also tabulate for comparison the frequencies that would have been observed
had there been strict independence between eye colour of husband and eye
colour of wife, i.e. the values of (AB)₀, etc., as in question 4.

6. (Figures from the Census of England and Wales, 1891, vol. iii.: the data
cannot be regarded as trustworthy.) The figures given below show the
number of males in successive age groups, together with the number of the
blind (A), of the mentally-deranged (B), and the blind mentally-deranged
(AB). Trace the association between blindness and mental derangement
from childhood to old age, tabulating the proportions of insane amongst the
whole population and amongst the blind, and also the association coefficient
Q of § 13. Give a short verbal statement of your results.

Age group.              N          (A)       (B)      (AB)

5-                  3,304,230       844     2,820      17
15-                 2,712,521     1,184     6,225      19
25-                 2,089,010     1,165     8,482      19
35-                 1,611,077     1,501     9,214      31
45-                 1,191,789     1,752     8,187      32
55-                   770,124     1,905     6,799      34
65-                   444,896     1,932     3,412      22
75 and upwards        161,692     1,701     1,098       9

7. Show that if

(AB)₁, (αB)₁, (Aβ)₁, (αβ)₁
(AB)₂, (αB)₂, (Aβ)₂, (αβ)₂

be two aggregates corresponding to the same values of (A), (B), (α), and (β),

$$(AB)_1 - (AB)_2 = (\alpha B)_2 - (\alpha B)_1 = (A\beta)_2 - (A\beta)_1 = (\alpha\beta)_1 - (\alpha\beta)_2.$$

8. Show that if

$$\delta = (AB) - (AB)_0,$$

$$(AB)^2 + (\alpha\beta)^2 - (\alpha B)^2 - (A\beta)^2 = [(A) - (\alpha)][(B) - (\beta)] + 2N\delta.$$

9. The existence of association may be tested either by comparison of
proportions (e.g. (AB)/(B) with (Aβ)/(β)), as in §§ 9, 10, or by the value of δ, as
in §§ 11, 12. Show that

$$\delta = \frac{(B)(\beta)}{N}\left\{\frac{(AB)}{(B)} - \frac{(A\beta)}{(\beta)}\right\} = \frac{(A)(\alpha)}{N}\left\{\frac{(AB)}{(A)} - \frac{(\alpha B)}{(\alpha)}\right\}.$$


CHAPTER  IV. 


PARTIAL  ASSOCIATION. 


1-2. Uncertainty in interpretation of an observed association. 3-5. Source of
the ambiguity: partial associations. 6-8. Illusory association due
to the association of each of two attributes with a third. 9. Estimation
of the partial associations from the frequencies of the second
order. 10-12. The total number of associations for a given number
of attributes. 13-14. The case of complete independence.


1. If we find that in any given case

$$(AB) > \frac{(A)(B)}{N} \quad\text{or}\quad (AB) < \frac{(A)(B)}{N},$$

all that is known is that there is a relation of some sort or kind
between A and B. The result by itself cannot tell us whether
the relation is direct, whether possibly it is only due to "fluctuations
of sampling" (cf. Chap. III. §§ 7-8), or whether it is of any other
particular kind that we may happen to have in our minds at the
moment. Any interpretation of the meaning of the association is
necessarily hypothetical, and the number of possible alternative
hypotheses is in general considerable.

2.  The  commonest  of  all  forms  of  alternative  hypothesis  is  of 
this  kind :  it  is  argued  that  the  relation  between  the  two  attributes 
A  and  B  is  not  direct,  but  due,  in  some  way,  to  the  association  of 
A  with  C  and  of  B  with  C.  An  illustration  or  two  will  make  the 
matter  clearer : — 

(1) An association is observed between "vaccination" and
"exemption from attack by small-pox," i.e. more of the vaccinated
than of the unvaccinated are exempt from attack. It is argued
that this does not imply a protective effect of vaccination, but is
wholly due to the fact that most of the unvaccinated are drawn from
the lowest classes, living in very unhygienic conditions. Denoting
vaccination by A, exemption from attack by B, hygienic conditions by
C, the argument is that the observed association between A and B
is due to the associations of both with C.
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(2)  It  is  observed,  at  a  general  election,  that  a  greater 
proportion  of  the  candidates  who  spent  more  money  than  their 
opponents  won  their  elections  than  of  those  who  spent  less.  It 
is  argued  that  this  does  not  mean  an  influence  of  expenditure  on 
the  result  of  elections,  but  is  due  to  the  fact  that  Conservative 
principles  generally  carried  the  day,  and  that  the  Conservatives 
generally spent more than the Liberals. Denoting winning by A,
spending more than the opponent by B, and Conservative by C, the
argument is the same as the above (cf. Question 9 at the end of
the chapter).

(3)  An  association  is  observed  between  the  presence  of  some 
attribute  in  the  father  and  its  presence  in  the  son;  and  also 
between  the  presence  of  the  attribute  in  the  grandfather  and  its 
presence  in  the  grandson.  Denoting  the  presence  of  the  attribute 
in son, father, and grandfather by A, B, and C, the question arises
whether the association between A and C may not be due solely
to the associations between A and B, B and C, respectively.

3.  The  ambiguity  in  such  cases  evidently  arises  from  the  fact 
that  the  universe  of  observation,  in  each  case,  contains  not 
merely  objects  possessing  the  third  attribute  alone,  or  objects 
not  possessing  it,  but  both. 

If  the  universe  were  restricted  to  either  class  alone  the  given 
ambiguity  would  not  arise,  though  of  course  others  might  remain. 

Thus,  in  the  first  illustration,  if  the  statistics  of  vaccination 
and  attack  were  drawn  from  one  narrow  section  of  the  population 
living  under  approximately  the  same  hygienic  conditions,  and  an 
association  were  still  observed  between  vaccination  and  exemption 
from  attack,  the  supposed  argument  would  be  refuted.  The  fact 
would  prove  that  the  association  between  vaccination  and 
exemption  could  not  be  wholly  due  to  the  association  of  both  with 
hygienic  conditions. 

Again,  in  the  second  illustration,  if  we  confine  our  attention  to 
the  "  universe  "  of  Conservatives  (instead  of  dealing  with  candidates 
of  both  parties  together),  and  compare  the  percentages  of  Conserva- 
tives winning  elections  when  they  spend  more  than  their  opponents 
and  when  they  spend  less,  we  shall  avoid  the  possible  fallacy.  If 
the  percentage  is  greater  in  the  former  case  than  in  the  latter,  it 
cannot  be  for  the  reasons  suggested  in  §  2. 

The  biological  case  of  the  third  illustration  should  be  similarly 
treated.  If  the  association  between  A  and  C  be  observed  for 
those  cases  in  which  all  the  parents,  say,  possess  the  attribute,  or 
else  all  do  not,  and  it  is  still  sensible,  then  the  association  first 
observed  between  A  and  C  for  the  whole  universe  cannot  have 
been due solely to the observed associations between A and B, B
and C.
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4.  The  associations  observed  between  the  attributes  A  and  B 
in the universe of C's and the universe of γ's may be termed
partial associations, to distinguish them from the total associations
observed between A and B in the universe at large. In terms of
the definition of § 5 of Chap. III., A and B will be said to be
positively associated in the universe of C's (cf. § 4 of Chap. II.) when

$$(ABC) > \frac{(AC)(BC)}{(C)} \tag{1}$$

and negatively associated in the converse case.

As  in  the  simpler  case,  the  association  is  most  simply  tested  by 
a  comparison  of  percentages  or  proportions  (§  9,  Chap.  III.), 
although  for  some  purposes  a  "coefficient  of  association"  of 
some  kind  may  be  useful.  Confining  our  attention  to  the  more 
fundamental  method,  if  A  and  B  are  positively  associated  within 
the  universe  of  C's,  we  must  have,  to  quote  only  the  four  most 
convenient  comparisons, 


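$$
\begin{aligned}
(a)\quad \frac{(ABC)}{(BC)} &> \frac{(AC)}{(C)}, &\qquad (b)\quad \frac{(ABC)}{(AC)} &> \frac{(BC)}{(C)},\\
(c)\quad \frac{(ABC)}{(BC)} &> \frac{(A\beta C)}{(\beta C)}, &\qquad (d)\quad \frac{(ABC)}{(AC)} &> \frac{(\alpha BC)}{(\alpha C)}.
\end{aligned}
\tag{2}
$$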

These inequalities may easily be rewritten for any other case by
making the proper substitutions in the symbols; thus to obtain
the inequalities for testing the association between A and C in
the universe of B's, B must be written for C, β for γ, and vice
versa, throughout; it being remembered that the order of the
letters in the class symbol is immaterial. The remarks of § 10,
Chap. III., as to the choice of the comparison to be used, apply of
course equally to the present case.

5.  Though  we  shall  confine  ourselves  in  the  present  work  to 
the  detailed  discussion  of  the  case  of  three  attributes,  it  should  be 
noticed  that  precisely  similar  conceptions  and  formulae  to  the 
above  apply  in  the  general  case  where  more  than  three  attributes 
have  been  noted,  or  where  the  relations  of  more  than  three  have 
to  be  taken  into  account.  If,  when  it  is  observed  that  A  and  B 
are  still  associated  within  the  universe  of  C's,  it  is  argued  that 
this  is  due  to  the  association  of  both  A  and  B  with  D,  the  argu- 
ment may  be  tested  by  still  further  limiting  the  field  of  observa- 
tion to  the  universe  OD.  If 

A  and  B  are  positively  associated  within  the  universe  of  CD's, 
and  the  association  cannot  be  wholly  ascribed  to  the  presence  and 
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absence  of  D  as  suggested,  nor  to  the  presence  and  absence  of 
C  and  D  conjointly.  If  it  be  then  argued  that  the  presence 
and  absence  of  E  is  the  source  of  association,  the  process  may 
be  repeated  as  before,  the  association  of  A  and  B  being  tested 
for  the  universe  CDE^  and  so  on  as  far  as  practicable. 

Partial  associations  thus  form  the  basis  of  discussion  for  any 
case,  however  complicated.  The  two  following  examples  will 
serve  as  illustrations  for  the  case  of  three  attributes. 

Example  i. — (Material  from  ref.  5  of  Chap.  I.) 

The following are the proportions per 10,000 of boys observed
with certain classes of defects, amongst a number of school
children. (A) denotes the number with development defects, (B)
with nerve-signs, (D) the number of the "dull."

    N      10,000          (AB)     338
    (A)       877          (AD)     338
    (B)     1,086          (BD)     455
    (D)       789          (ABD)    153

The Report from which the figures are drawn concludes that "the
connecting link between defects of body and mental dulness is
the coincident defect of brain which may be known by observation
of abnormal nerve-signs." Discuss this conclusion.

The phrase "connecting link" is a little vague, but it may
mean that the mental defects indicated by nerve-signs B may
give rise to development-defects A, and also to mental-dulness
D; A and D being thus common effects of the same cause
B (or another attribute necessarily indicated by B), and not
directly influencing each other. The case is thus similar to that
of the first illustration of § 2 (liability to small-pox and to
non-vaccination being held to be common effects of the same
circumstances), and may be similarly treated by investigation of the
partial associations between A and D for the universes B and
β. As the ratios (A)/N, (B)/N are small, comparisons of the
form (4) (b) of Chap. III. (p. 31), or (2) (a) (b) above, may very
well be used (cf. the remarks in § 10 of the same chapter, p. 31).

The following figures illustrate, then, the association between
A and D for the whole universe, the B-universe and the
β-universe:

For the entire material:
  Proportion of the dull = (D)/N = 789/10,000 = 7.9 per cent.
  Proportion of the defectively developed who were dull
      = (AD)/(A) = 338/877 = 38.5 per cent.

For those exhibiting nerve signs:
  Proportion of the dull = (BD)/(B) = 455/1,086 = 41.9 per cent.
  Proportion of the defectively developed who were dull
      = (ABD)/(AB) = 153/338 = 45.3 per cent.

For those not exhibiting nerve signs:
  Proportion of the dull = (βD)/(β) = 334/8,914 = 3.7 per cent.
  Proportion of the defectively developed who were dull
      = (AβD)/(Aβ) = 185/539 = 34.3 per cent.

The results are extremely striking; the association between A
and D is very high indeed both for the material as a whole (the
universe at large) and for those not exhibiting nerve-signs (the
β-universe), but it is very small for those who do exhibit
nerve-signs (the B-universe).

This result does not appear to be in accord with the conclusion
of the Report, as we have interpreted it, for the association
between A and D in the β-universe should in that case have
been very low instead of very high.

Example ii. Eye-colour of grandparent, parent and child.
(Material from Sir Francis Galton's Natural Inheritance (1889),
table 20, p. 216. The table only gives particulars for 78 large
families with not less than 6 brothers or sisters, so that the
material is hardly entirely representative, but serves as a good
illustration of the method.) The original data are treated as in
Example vii. of the last chapter (p. 33). Denoting a light-eyed
child by A, parent by B, grandparent by C, every possible line of
descent is taken into account. Thus, taking the following two
lines of the table,

        Children                  Parents                 Grandparents
     A            α            B            β            C            γ
 Light-eyed.  Not-           Light-eyed.  Not-           Light-eyed.  Not-
              light-eyed.                 light-eyed.                 light-eyed.

     4            5            1            1            1            3
     3            4            1            1            4            0

the first would give 4 x 1 x 1 = 4 to the class ABC, 4 x 1 x 3 = 12 to
the class ABγ, 4 to AβC, 12 to Aβγ, 5 to αBC, 15 to αBγ, 5 to
αβC, and 15 to αβγ; the second would give 3 x 1 x 4 = 12 to the
class ABC, 12 to AβC, 16 to αBC, 16 to αβC, and none to the
remainder. The class-frequencies so derived from the whole table are,

    (ABC)    1928          (αBC)    303
    (ABγ)     596          (αBγ)    225
    (AβC)     552          (αβC)    395
    (Aβγ)     508          (αβγ)    501
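
The way each line of Galton's table is expanded into contributions to the eight classes can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and lower-case letters a, b, c stand in for α, β, γ so that the labels stay in plain ASCII.

```python
from itertools import product

def line_contributions(children, parents, grandparents):
    """Contributions of one table line to the eight third-order classes.

    Each argument is a pair (light_eyed, not_light_eyed). Every possible
    line of descent child-parent-grandparent is counted, so each class
    receives the product of the three corresponding counts.
    """
    labels = (("A", "a"), ("B", "b"), ("C", "c"))  # a, b, c denote alpha, beta, gamma
    counts = {}
    for i, j, k in product(range(2), repeat=3):
        key = labels[0][i] + labels[1][j] + labels[2][k]
        counts[key] = children[i] * parents[j] * grandparents[k]
    return counts

# The two lines of the table quoted in Example ii.
first = line_contributions((4, 5), (1, 1), (1, 3))
second = line_contributions((3, 4), (1, 1), (4, 0))
print(first)
print(second)
```

Summing such dictionaries over every line of the original table would give the class-frequencies tabulated in the example; only the two quoted lines are reproduced here.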
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The  following  comparisons  indicate  the  association  between 
grandparents and parents, parents and children, and
grandparents and grandchildren, respectively:

Grandparents and Parents.

Proportion of light-eyed amongst the children of light-eyed
grandparents = (BC)/(C) = 2231/3178 = 70.2 per cent.

Proportion of light-eyed amongst the children of not-light-eyed
grandparents = (Bγ)/(γ) = 821/1830 = 44.9 per cent.

Parents and Children.

Proportion of light-eyed amongst the children of light-eyed
parents = (AB)/(B) = 2524/3052 = 82.7 per cent.

Proportion of light-eyed amongst the children of not-light-eyed
parents = (Aβ)/(β) = 1060/1956 = 54.2 per cent.

In both the above cases we are really dealing with the
association between parent and offspring, and consequently the
intensity of association is, as might be expected, approximately
the same; in the next case it is naturally lower:

Grandparents and Grandchildren.

Proportion of light-eyed amongst the grandchildren of light-eyed
grandparents = (AC)/(C) = 2480/3178 = 78.0 per cent.

Proportion of light-eyed amongst the grandchildren of not-light-eyed
grandparents = (Aγ)/(γ) = 1104/1830 = 60.3 per cent.

We proceed now to test the partial associations between
grandparents and grandchildren, as distinct from the total associations
given above, in order to throw light on the real nature of the
resemblance. There are two such partial associations to be
tested: (1) where the parents are light-eyed, (2) where they are
not-light-eyed. The following are the comparisons:

Grandparents and Grandchildren: Parents light-eyed.

Proportion of light-eyed amongst the grandchildren of light-eyed
grandparents = (ABC)/(BC) = 1928/2231 = 86.4 per cent.

Proportion of light-eyed amongst the grandchildren of not-light-eyed
grandparents = (ABγ)/(Bγ) = 596/821 = 72.6 per cent.

Grandparents and Grandchildren: Parents not-light-eyed.

Proportion of light-eyed amongst the grandchildren of light-eyed
grandparents = (AβC)/(βC) = 552/947 = 58.3 per cent.

Proportion of light-eyed amongst the grandchildren of not-light-eyed
grandparents = (Aβγ)/(βγ) = 508/1009 = 50.3 per cent.

In both cases the partial association is quite well-marked and
positive; the total association between grandparents and
grandchildren cannot, then, be due wholly to the total associations
between grandparents and parents, parents and children,
respectively. There is an ancestral heredity, as it is termed, as
well as a parental heredity.

We need not discuss the partial association between children and
parents, as it is comparatively of little consequence. It may be
noted, however, as regards the above results, that the most
important feature may be brought out by stating three ratios
only.

If A and B are positively associated, (AB)/(B) > (A)/N.

If A and C are positively associated in the universe of B's,
(ABC)/(BC) > (AB)/(B). Hence (A)/N, (AB)/(B), and (ABC)/(BC)
form an ascending series. Thus we have from the given data:

Proportion of light-eyed amongst children in general
    = (A)/N = 71.6 per cent.

Proportion of light-eyed amongst the children of light-eyed
parents = (AB)/(B) = 82.7 per cent.

Proportion of light-eyed amongst the children of light-eyed
parents and grandparents = (ABC)/(BC) = 86.4 per cent.

If the great-grandparents, etc., etc., were also known, the series
might be continued, giving (ABCD)/(BCD), (ABCDE)/(BCDE),
and so forth. The series would probably ascend continuously
though with smaller intervals, A and D being positively associated
in the universe of BC's, A and E in the universe of BCD's, etc.
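
The total and partial comparisons of Example ii can be recomputed from the eight class-frequencies. The sketch below is illustrative only: the helper names `total` and `pct` are hypothetical, and lower-case a, b, c stand in for α, β, γ.

```python
# Third-order class-frequencies of Example ii; a, b, c denote alpha, beta, gamma.
freq = {"ABC": 1928, "aBC": 303, "ABc": 596, "aBc": 225,
        "AbC": 552, "abC": 395, "Abc": 508, "abc": 501}

def total(*symbols):
    """Sum the classes whose label contains every given symbol,
    e.g. total("A", "C") is (AC) and total() is N."""
    return sum(n for label, n in freq.items()
               if all(s in label for s in symbols))

def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

# Total association between grandparents (C) and grandchildren (A).
total_assoc = (pct(total("A", "C"), total("C")),
               pct(total("A", "c"), total("c")))

# Partial associations: parents light-eyed (B), then not-light-eyed (b).
partial_B = (pct(total("A", "B", "C"), total("B", "C")),
             pct(total("A", "B", "c"), total("B", "c")))
partial_b = (pct(total("A", "b", "C"), total("b", "C")),
             pct(total("A", "b", "c"), total("b", "c")))

# The ascending series (A)/N, (AB)/(B), (ABC)/(BC).
series = (pct(total("A"), total()),
          pct(total("A", "B"), total("B")),
          pct(total("A", "B", "C"), total("B", "C")))

print(total_assoc, partial_B, partial_b, series)
```

In each pair the first percentage exceeds the second, which is the positive association asserted in the text.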

6.  The  above  examples  will  serve  to  illustrate  the  practical 
application  of  partial  associations  to  concrete  cases.  The  general 
nature  of  the  fallacies  involved  in  interpreting  associations 
between  two  attributes  as  if  they  were  necessarily  due  to  the 
most  obvious  form  of  direct  causation  is  more  clearly  exhibited 
by  the  following  theorem  : — 

If A and B are independent within the universe of C's and also
within the universe of γ's, they will nevertheless be associated
within the universe at large, unless C is independent of either A
or B or both.

The two data give

$$(ABC) = \frac{(AC)(BC)}{(C)}, \qquad (AB\gamma) = \frac{(A\gamma)(B\gamma)}{(\gamma)} = \frac{[(A)-(AC)]\,[(B)-(BC)]}{(\gamma)}. \tag{3}$$

Adding them together we have

$$(AB) = \frac{N(AC)(BC) - (A)(C)(BC) - (B)(C)(AC) + (A)(B)(C)}{(C)(\gamma)}.$$

Write, as in § 11 of Chap. III. (p. 35),

$$(AC)_0 = \frac{(A)(C)}{N}, \qquad (BC)_0 = \frac{(B)(C)}{N},$$

subtract (AB)₀ from both sides of the above equation, simplify,
and we have

$$(AB) - (AB)_0 = \frac{N}{(C)(\gamma)}\,[(AC)-(AC)_0]\,[(BC)-(BC)_0]. \tag{4}$$

This proves the theorem; for the right-hand side will not be
zero unless either (AC) = (AC)₀ or (BC) = (BC)₀.
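
Equation (4) can be checked numerically. In the sketch below the function name and all figures are hypothetical: a universe is built so that A and B are independent within C and within γ, and the two sides of (4) are then compared in exact arithmetic.

```python
from fractions import Fraction

def delta_both_sides(n_c, ac, bc, n_gamma, a_gamma, b_gamma):
    """Return the two sides of equation (4) for a universe in which
    A and B are independent within C and within gamma.

    n_c, ac, bc             : (C), (AC), (BC)
    n_gamma, a_gamma, b_gamma : (gamma), (A gamma), (B gamma)
    """
    n = n_c + n_gamma
    a, b = ac + a_gamma, bc + b_gamma
    # Independence inside each sub-universe fixes the third-order classes.
    abc = Fraction(ac * bc, n_c)
    ab_gamma = Fraction(a_gamma * b_gamma, n_gamma)
    ab = abc + ab_gamma
    lhs = ab - Fraction(a * b, n)                      # (AB) - (AB)_0
    rhs = (Fraction(n, n_c * n_gamma)
           * (ac - Fraction(a * n_c, n))               # (AC) - (AC)_0
           * (bc - Fraction(b * n_c, n)))              # (BC) - (BC)_0
    return lhs, rhs

# Hypothetical figures: C associated with both A and B.
print(delta_both_sides(100, 80, 30, 100, 40, 60))
# Hypothetical figures: C independent of A, so both sides vanish.
print(delta_both_sides(100, 50, 30, 100, 50, 60))
```

In the first call both sides are negative, because C is positively associated with A and negatively with B; in the second call C is independent of A and the difference is zero.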

7. The result indicates that, while no degree of heterogeneity
in the universe can influence the association between A and B
if all other attributes are independent of either A or B or both,
an illusory or misleading association may arise in any case where
there exists in the given universe a third attribute C with which
both A and B are associated (positively or negatively). If both
associations are of the same sign, the resulting illusory association
between A and B will be positive; if of opposite sign, negative.
The three illustrations of § 2 are all of the first kind. In (1) it
is argued that the positive associations between vaccination and
hygienic conditions, exemption from attack and hygienic conditions,
give rise to an illusory positive association between vaccination
and exemption from attack. In (2) it is argued that the positive
associations between conservative and winning, conservative and
spending more, give rise to an illusory positive association between
winning and spending more. In (3) the question is raised whether
the positive association between grandparent and grandchild may
not be due solely to the positive associations between grandparent
and parent, parent and child.

Misleading  associations  of  this  kind  may  easily  arise  through 
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the  mingling  of  records,  e.g.  respecting  the  two  sexes,  which  a 
careful  worker  would  keep  distinct. 

Take  the  following  case,  for  example.  Suppose  there  have  been 
200  patients  in  a  hospital,  100  males  and  100  females,  suffering 
from  some  disease.  Suppose,  further,  that  the  death-rate  for  males 
(the  case  mortality)  has  been  30  per  cent.,  for  females  60  per  cent. 
A  new  treatment  is  tried  on  80  per  cent,  of  the  males  and  40  per 
cent,  of  the  females,  and  the  results  published  without  distinction 
of  sex.  The  three  attributes,  with  the  relations  of  which  we  are 
here  concerned,  are  death,  treatment  and  male  sex.  The  data  show 
that  more  males  were  treated  than  females,  and  more  females 
died  than  males ;  therefore  the  first  attribute  is  associated  nega- 
tively, the  second  positively,  with  the  third.  It  follows  that  there 
will  be  an  illusory  negative  association  between  the  first  two — 
death  and  treatment.  If  the  treatment  were  completely  inefficient 
we  would,  in  fact,  have  the  following  results  : — 


Males. 

Females. 

Total. 

Treated  and  died  .       .  • 

24 

24 

48 

and  did  not  die 

56 

16 

72 

Not  treated  and  died 

6 

36 

42 

„           and  did  not  die  . 

14 

24 

38 

i.e. of the treated, only 48/120 = 40 per cent. died, while of those
not treated 42/80 = 52.5 per cent. died. If this result were stated
without any reference to the fact of the mixture of the sexes, to
the different proportions of the two that were treated and to the
different death-rates under normal treatment, then some value in
the new treatment would appear to be suggested. To make
a fair return, either the results for the two sexes should be
stated separately, or the same proportion of the two sexes
must receive the experimental treatment. Further, care would
have to be taken in such a case to see that there was no
selection (perhaps unconscious) of the less severe cases for
treatment, thus introducing another source of fallacy (death positively
associated with severity, treatment negatively associated with
severity, giving rise to illusory negative association between
treatment and death).
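
The hospital illustration can be reproduced in a few lines. The following is only a sketch under the assumption stated in the text (a treatment with no effect whatever); the function name `pooled_death_rates` is invented for this illustration.

```python
def pooled_death_rates(groups):
    """Pool several groups and return (death-rate of treated, death-rate of untreated).

    Each group is (patients, share_treated, death_rate). The treatment is
    assumed to be completely ineffective, so the same death-rate applies to
    the treated and the untreated within a group.
    """
    treated = treated_dead = untreated = untreated_dead = 0.0
    for patients, share_treated, death_rate in groups:
        t = patients * share_treated
        u = patients - t
        treated += t
        untreated += u
        treated_dead += t * death_rate
        untreated_dead += u * death_rate
    return treated_dead / treated, untreated_dead / untreated


males = (100, 0.80, 0.30)    # 100 males, 80 per cent. treated, 30 per cent. die
females = (100, 0.40, 0.60)  # 100 females, 40 per cent. treated, 60 per cent. die

rate_treated, rate_untreated = pooled_death_rates([males, females])
print(f"treated: {100 * rate_treated:.1f} per cent., "
      f"untreated: {100 * rate_untreated:.1f} per cent.")
```

Passing either sex alone returns two equal rates, which is the sense in which the pooled difference is illusory.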

A misleading association between the characters of parent and
offspring might similarly be created if the records for male-male
and female-female lines of descent were mixed. Thus suppose 50
per cent. of males and 10 per cent. of females exhibit some
attribute for which there is no association in either line, then we
would have for each line and for a mixed record of equal
numbers:

                                       Male line.     Female line.    Mixed record.

Parents with attribute and
   children with                      25 per cent.     1 per cent.    13 per cent.
Parents with attribute and
   children without                   25    "          9    "         17    "
Parents without attribute
   and children with                  25    "          9    "         17    "
Parents without attribute
   and children without               25    "         81    "         53    "


Here 13/30 = 43 per cent. of the offspring of parents with the
attribute possess the attribute themselves, but only 17/70 = 24
per cent. of the offspring of parents without the attribute. The
association between attribute in parent and attribute in offspring
is, however, due solely to the association of both with male sex.
The  student  will  see  that  if  records  for  male-female  and  female- 
male  lines  were  mixed,  the  illusory  association  would  be  negative, 
and  that  if  all  four  lines  were  combined  there  would  be  no  illusory 
association  at  all. 
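
The mixed record can be checked numerically. This is a minimal sketch, assuming equal numbers in the two lines and independence of parent and child within each line, as in the text; `mixed_record` is a name made up for the example.

```python
def mixed_record(p_first, p_second):
    """Class proportions when equal numbers from two lines of descent are pooled.

    Within each line the attribute occurs with proportion p in parents and in
    children alike, and parent and child are independent.
    """
    def line(p):
        return {
            "both": p * p,
            "parent_only": p * (1 - p),
            "child_only": (1 - p) * p,
            "neither": (1 - p) * (1 - p),
        }

    first, second = line(p_first), line(p_second)
    return {key: (first[key] + second[key]) / 2 for key in first}


mix = mixed_record(0.50, 0.10)  # male-male and female-female lines
# Proportion of children with the attribute, by attribute of the parent.
with_attr = mix["both"] / (mix["both"] + mix["parent_only"])
without_attr = mix["child_only"] / (mix["child_only"] + mix["neither"])
print(f"{100 * with_attr:.0f} per cent. against {100 * without_attr:.0f} per cent.")
```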

8.  Illusory  associations  may  also  arise  in  a  different  way 
through  the  personality  of  the  observer  or  observers.  If  the 
observer's attention fluctuates, he may be more likely to notice
the presence of A when he notices the presence of B, and vice
versa; in such a case A and B (so far as the record goes) will both
be  associated  with  the  observer's  attention  C,  and  consequently 
an  illusory  association  will  be  created.  Again,  if  the  attributes 
are  not  well  defined,  one  observer  may  be  more  generous  than 
another  in  deciding  when  to  record  the  presence  of  A  and  also 
the  presence  of  B,  and  even  one  observer  may  fluctuate  in  the 
generosity  of  his  marking.  In  this  case  the  recording  of  A  and 
the  recording  of  B  will  both  be  associated  with  the  generosity 
of  the  observer  in  recording  their  presence,  C,  and  an  illusory 
association  between  A  and  B  will  consequently  arise,  as 
before. 

9.  It  is  important  to  notice  that,  though  we  cannot  actually 
determine  the  partial  associations  unless  the  third-order  frequency 
(ABC)  is  given,  we  can  make  some  conjecture  as  to  their  sign 
from  the  values  of  the  second-order  frequencies. 

Suppose, for instance, that

$$(ABC) = \frac{(AC)(BC)}{(C)} + \delta_1, \qquad (AB\gamma) = \frac{(A\gamma)(B\gamma)}{(\gamma)} + \delta_2, \tag{5}$$

so that δ1 and δ2 are positive or negative according as A and B
are positively or negatively associated in the universes of C and
γ respectively. Then we have by addition

$$(AB) = \frac{(AC)(BC)}{(C)} + \frac{(A\gamma)(B\gamma)}{(\gamma)} + \delta_1 + \delta_2. \tag{6}$$

Hence if the value of (AB) exceed the value given by the first
two terms (i.e. if δ1 + δ2 be positive), A and B must be positively
associated either in the universe of C's, the universe of γ's, or
both. If, on the other hand, (AB) fall short of the value given by
the first two terms, A and B must be negatively associated in
the universe of C's, the universe of γ's, or both. Finally, if
(AB) be equal to the value of the first two terms, A and B must
be positively associated in the one partial universe and negatively
in the other, or else independent in both.

The expression (6) may often be used in the following form,
obtained by dividing through by, say, (B):

$$\frac{(AB)}{(B)} = \frac{(AC)}{(C)}\cdot\frac{(BC)}{(B)} + \frac{(A\gamma)}{(\gamma)}\cdot\frac{(B\gamma)}{(B)} + \frac{\delta_1 + \delta_2}{(B)}. \tag{7}$$

In  using  this  expression  we  make  use  solely  of  proportions  or 
percentages,  and  judge  of  the  sign  of  the  partial  associations 
between  A  and  B  accordingly.  A  concrete  case,  as  in  Example  iii. 
below,  is  perhaps  clearer  than  the  general  formula. 

Example  iii. — (Figures  compiled  from  Supplement  to  the  Fifty- 
fifth  Annual  Report  of  the  Registrar-General  [C. — 8503],  1897.) 
The  following  are  the  death-rates  per  thousand  per  annum,  and  the 
proportions  over  65  years  of  age,  of  occupied  males  in  general, 
farmers,  textile  workers,  and  glass  workers  (over  15  years  of  age 
in  each  case)  during  the  decade  1891-1900  in  England  and  Wales. 

                                     Death-rate       Proportion per thousand
                                    per thousand.      over 65 Years of Age.

Occupied males over 15                  15.8                    46
Farmers, males over 15                  19.6                   132
Textile workers, males over 15          15.9                    34
Glass workers, males over 15            16.6                    16

Would farming, textile working, and glass working seem to be
relatively healthy or unhealthy occupations, given that the death-
rates among occupied males from 15-65 and over 65 years of age
are 11.5 and 102.3 per thousand respectively?

If A denote death, B the given occupation, C old age, we have
to apply the principle of equation (7). Calculate what would be
the death-rate for each occupation on the supposition that the
death-rates for occupied males in general (11.5, 102.3) apply to
each of its separate age-groups (under 65, over 65), and see
whether the total death-rate so calculated exceeds or falls short
of the actual death-rate. If it exceeds the actual rate, the
occupation must on the whole be healthy; if it falls short,
unhealthy. Thus we have the following calculated death-rates:

Farmers            11.5 x .868 + 102.3 x .132 = 23.5.
Textile workers    11.5 x .966 + 102.3 x .034 = 14.6.
Glass workers      11.5 x .984 + 102.3 x .016 = 13.0.

The calculated rate for farmers largely exceeds the actual rate;
farming, then, must on the whole, as one would expect, be
a healthy occupation. The death-rate for either young farmers
or old farmers, or both, must be less than for occupied males in
general (the last is actually the case); the high death-rate
observed is due solely to the large proportion of the aged. Textile
working, on the other hand, appears to be unhealthy (14.6 < 15.9),
and glass working still more so (13.0 < 16.6); the actual low total
death-rates are due merely to low proportions of the aged.
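
The comparison of Example iii. amounts to a rough two-class age standardisation, and may be sketched as follows. The figures are those quoted in the example; the names `expected_rate` and `verdicts` are invented for the illustration.

```python
# Death-rates per thousand for occupied males in general, by age-group.
RATE_UNDER_65, RATE_OVER_65 = 11.5, 102.3

# occupation: (actual death-rate per thousand, number per thousand over 65)
occupations = {
    "Farmers": (19.6, 132),
    "Textile workers": (15.9, 34),
    "Glass workers": (16.6, 16),
}


def expected_rate(over_65_per_thousand):
    """Death-rate the occupation would show if the general rates applied to each age-group."""
    share_old = over_65_per_thousand / 1000
    return RATE_UNDER_65 * (1 - share_old) + RATE_OVER_65 * share_old


verdicts = {}
for name, (actual, over_65) in occupations.items():
    calculated = expected_rate(over_65)
    # Calculated rate above the actual rate: the occupation is on the whole healthy.
    verdicts[name] = "healthy" if calculated > actual else "unhealthy"
    print(f"{name}: calculated {calculated:.1f}, actual {actual}, {verdicts[name]}")
```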

It  is  evident  that  age-distributions  vary  so  largely  from  one 
occupation  to  another  that  total  death-rates  are  liable  to  be  very 
misleading — so  misleading,  in  fact,  that  they  are  not  tabulated  at  all 
by  the  Registrar-General ;  only  death-rates  for  narrow  limits  of  age 
(5  or  10  year  age-classes)  are  worked  out.  Similar  fallacies  are 
liable  to  occur  in  comparisons  of  local  death-rates,  owing  to 
variations  not  only  in  the  relative  proportions  of  the  old,  but  also 
in  the  relative  proportions  of  the  two  sexes. 

It  is  hardly  necessary  to  observe  that  as  age  is  a  variable  quantity, 
the  above  procedure  for  calculating  the  comparative  death-rates 
is extremely rough. The death-rate of those engaged in any
occupation depends not only on the mere proportions over and under
65,  but  on  the  relative  numbers  at  every  single  year  of  age.  The 
simpler  procedure  brings  out,  however,  better  than  a  more  complex 
one,  the  nature  of  the  fallacy  involved  in  assuming  that  crude  death- 
rates  are  measures  of  healthiness.    [See  also  Chap.  XI.  §§  17-19.] 

Example  iv. — Eye-colour  in  grandparent,  parent  and  child. 
(The  figures  are  those  of  Example  ii.) 

A, light-eyed child; B, light-eyed parent; C, light-eyed
grandparent.

N   = 5008          (AB) = 2524
(A) = 3584          (AC) = 2480
(B) = 3052          (BC) = 2231
(C) = 3178

Given only the above data, investigate whether there is probably
a partial association between child and grandparent.
If there were no partial association we would have

$$(AC) = \frac{(AB)(BC)}{(B)} + \frac{(A\beta)(\beta C)}{(\beta)} = \frac{2524 \times 2231}{3052} + \frac{1060 \times 947}{1956} = 1845.0 + 513.2 = 2358.2.$$

Actually (AC) = 2480; there must, then, be partial association
either in the B-universe, the β-universe, or both. In the absence
of any reason to the contrary, it would be natural to suppose there
is a partial association in both; i.e. that there is a partial
association with the grandparent whether the line of descent
passes through "light-eyed" or "not-light-eyed" parents, but this
could not be proved without a knowledge of the class-frequency
(ABC).
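
The arithmetic of Example iv. is easily checked by machine. The sketch below uses only the class-frequencies quoted in the example; the variable names are chosen for this illustration.

```python
# A, light-eyed child; B, light-eyed parent; C, light-eyed grandparent.
N, A, B, C = 5008, 3584, 3052, 3178
AB, AC, BC = 2524, 2480, 2231

# Frequencies involving the negative attribute beta (not-light-eyed parent).
A_beta = A - AB   # 1060
beta_C = C - BC   # 947
beta = N - B      # 1956

# Value of (AC) if child and grandparent were independent in both the
# universe of B's and the universe of beta's.
expected_AC = AB * BC / B + A_beta * beta_C / beta
excess = AC - expected_AC

print(f"expected (AC) = {expected_AC:.1f}, actual (AC) = {AC}")
# A positive excess means a positive partial association in at least one universe.
```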

10.  The  total  possible  number  of  associations  to  be  derived  from 
n attributes grows so rapidly with the value of n that the evaluation
of them all for any case in which n is greater than four
becomes  almost  unmanageable.  For  three  attributes  there  are  9 
possible  associations — three  totals,  three  partials  in  positive 
universes,  and  three  partials  in  negative  universes.  For  four 
attributes,  the  number  of  possible  associations  rises  to  54, 
for  there  are  6  pairs  to  be  formed  from  four  attributes,  and 
we  can  find  9  associations  for  each  pair  (1  total,  4  partials 
with  the  universe  specified  by  one  attribute,  and  4  partials 
with  the  universe  specified  by  two).  For  five  attributes  the 
student  will  find  that  there  are  no  less  than  270,  and  for  six 
attributes  1215  associations. 
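
The counts quoted above follow a simple rule, which may be sketched thus. The closed form is an inference from the enumeration in the text, not a formula the text states: each pair of attributes may be examined in every universe in which each of the remaining n - 2 attributes is unspecified, present, or absent.

```python
from math import comb


def possible_associations(n):
    """Number of total and partial associations among n dichotomous attributes."""
    pairs = comb(n, 2)             # pairs of attributes to be compared
    universes = 3 ** (n - 2)       # each other attribute: unspecified, present, absent
    return pairs * universes


for n in range(3, 7):
    print(n, possible_associations(n))
```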

As  suggested  by  Examples  i.  and  ii.  above,  however,  it  is  not 
necessary  in  any  actual  case  to  investigate  all  the  associations 
that  are  theoretically  possible  ;  the  nature  of  the  problem  indicates 
those  that  are  required. 

In Example i., for instance, the total and partial associations
between A and D were alone investigated; the associations between
A and B, B and D were not essential for answering the question
that was asked. In Example ii., again, the three total associations
and the partial association between A and C were worked out,
but the partial associations between A and B, B and C were
omitted as unnecessary. Practical considerations of this kind will
always lessen the amount of necessary labour.

11. It might appear, at first sight, that theoretical considerations
would enable us to lessen it still further. As we saw in
Chapter I., all class-frequencies can be expressed in terms of those
of the positive classes, of which there are 2^n in the case of n
attributes. For given values of the n + 1 frequencies N, (A), (B),
(C), ... of order lower than the second, assigned values of the
positive class-frequencies of the second and higher orders must
therefore correspond to determinate values of all the possible
associations. But the number of these positive class-frequencies
of the second and higher orders is only 2^n - (n + 1); therefore the
number of algebraically independent associations that can be
derived from n attributes is only 2^n - (n + 1). For successive
values of n this gives:

        n        2^n - (n + 1)

        2              1
        3              4
        4             11
        5             26
        6             57

Hence if we give data, in any form, that determine four
associations in the case of three attributes, eleven in the case of
four attributes, and so on, in addition to N and the class-frequencies
of the first order, we have done all that is theoretically necessary.
The remaining associations can be deduced.
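
For comparison with the counts of § 10, the number of algebraically independent associations may be tabulated in the same way. This is a sketch only; the function name is invented for the illustration.

```python
def independent_associations(n):
    """Positive class-frequencies of the second and higher orders: 2^n - (n + 1)."""
    return 2 ** n - (n + 1)


for n in range(2, 7):
    print(n, independent_associations(n))
```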

12. Practically, however, the mere fact that they can be deduced
is of little help unless such deduction can be effected simply
and almost directly, by little more than mental arithmetic, and
this is not the case. The relations that exist between the ratios
or differences, such as (AB) - (AB)_0, that indicate the associations
are, in fact, so complex that an unknown association cannot be
determined from those that are given without more or less lengthy
work; it is not possible to infer even its sign by any simple
process of inspection. We have, for instance, from (5), by the
process used in obtaining (4) for the special case of § 6,

$$(AB\gamma) - \frac{(A\gamma)(B\gamma)}{(\gamma)} = \left[(AB) - (AB)_0\right] - \frac{N}{(C)(\gamma)}\left[(AC) - (AC)_0\right]\left[(BC) - (BC)_0\right] - \left[(ABC) - \frac{(AC)(BC)}{(C)}\right]$$

which gives us the difference of (ABγ) from the value it would
have if A and B were independent in the universe of γ's in terms
of the difference of (ABC) from the value it would have if A and
B were independent in the universe of C's, and the corresponding
differences for the frequencies (AB), (AC), and (BC). The four
quantities in the brackets on the right represent, say, the four
known associations, the bracket on the left the unknown association.
Clearly, the relation is not of such a simple kind that the term on
the left can be, in general, mentally evaluated. Hence in
considering the choice and number of associations to be actually
tabulated, regard must be had to practical considerations rather
than to theoretical relations.
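
That the relation of § 12 is an identity, though not one to be evaluated mentally, can be verified with exact fractions. The sketch below computes the difference of (ABγ) from its independence value in two ways: directly, and from the differences for (AB), (AC), (BC) and (ABC). The function name is invented, and the trial figures are those of the hospital illustration earlier in the chapter (A death, B treatment, C male sex).

```python
from fractions import Fraction as F


def partial_excess_in_gamma(N, A, B, C, AB, AC, BC, ABC):
    """Return (direct, via_relation) for (ABγ) - (Aγ)(Bγ)/(γ)."""
    gamma = N - C
    # Direct calculation from the frequencies in the universe of γ's.
    direct = (AB - ABC) - F((A - AC) * (B - BC), gamma)
    # The same quantity from the four differences on the right-hand side.
    d_ab = AB - F(A * B, N)        # (AB) - (AB)_0
    d_ac = AC - F(A * C, N)        # (AC) - (AC)_0
    d_bc = BC - F(B * C, N)        # (BC) - (BC)_0
    d_abc = ABC - F(AC * BC, C)    # (ABC) - (AC)(BC)/(C)
    via_relation = d_ab - F(N, C * gamma) * d_ac * d_bc - d_abc
    return direct, via_relation


hospital = partial_excess_in_gamma(N=200, A=90, B=120, C=100,
                                   AB=48, AC=30, BC=80, ABC=24)
print(hospital)  # both entries are zero: no association among females
```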

13. The particular case in which all the 2^n - (n + 1) given
associations are zero is worth some special investigation.

It follows, in the first place, that all other possible associations
must be zero, i.e. that a state of complete independence, as we
may term it, exists. Suppose, for instance, that we are given

$$(AB) = \frac{(A)(B)}{N}, \quad (AC) = \frac{(A)(C)}{N}, \quad (BC) = \frac{(B)(C)}{N}, \quad (ABC) = \frac{(AC)(BC)}{(C)} = \frac{(A)(B)(C)}{N^2}.$$

Then it follows at once that we have also

$$(ABC) = \frac{(AB)(BC)}{(B)} = \frac{(AB)(AC)}{(A)},$$

i.e. A and C are independent in the universe of B's, and B and C
in the universe of A's. Again,

$$(AB\gamma) = (AB) - (ABC) = \frac{(A)(B)}{N} - \frac{(A)(B)(C)}{N^2} = \frac{(A)(B)(\gamma)}{N^2} = \frac{(A\gamma)(B\gamma)}{(\gamma)}.$$

Therefore A and B are independent in the universe of γ's.
Similarly, it may be shown that A and C are independent in the
universe of β's, B and C in the universe of α's.

In the next place it is evident from the above that relations of
the general form (to write the equation symmetrically)

$$\frac{(ABC)}{N} = \frac{(A)}{N}\cdot\frac{(B)}{N}\cdot\frac{(C)}{N} \tag{8}$$

must hold for every class-frequency. This relation is the general
form of the equation of independence, (2) (d), Chap. III. (p. 26).

14. It must be noted, however, that (8) is not a criterion for the
complete independence of A, B, and C in the sense that the
equation

$$\frac{(AB)}{N} = \frac{(A)}{N}\cdot\frac{(B)}{N}$$

is a criterion for the complete independence of A and B. If we
are given N, (A), and (B), and the last relation quoted holds
good, we know that similar relations must hold for (Aβ), (αB),
and (αβ). If N, (A), (B), and (C) be given, however, and the
equation (8) hold good, we can draw no conclusion without
further information; the data are insufficient. There are eight
algebraically independent class-frequencies in the case of three
attributes, while N, (A), (B), (C) are only four: the equation (8)
must therefore be shown to hold good for four frequencies of the
third order before the conclusion can be drawn that it holds good
for the remainder, i.e. that a state of complete independence
subsists. The direct verification of this result is left for the
student.

Quite generally, if N, (A), (B), (C), .... be given, the relation

$$\frac{(ABC \ldots)}{N} = \frac{(A)}{N}\cdot\frac{(B)}{N}\cdot\frac{(C)}{N}\cdots \tag{9}$$

must be shown to hold good for 2^n - (n + 1) of the nth order classes
before it may be assumed to hold good for the remainder. It is
only because

2^n - (n + 1) = 1

when n = 2 that the relation

$$\frac{(AB)}{N} = \frac{(A)}{N}\cdot\frac{(B)}{N}$$

may be treated as a criterion for the independence of A and B.
If all the n (n > 2) attributes are completely independent, the
relation (9) holds good; but it does not follow that if the relation
(9) hold good they are all independent.
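
A small numerical case may make the point of § 14 plainer. The eight class-frequencies below are invented purely for illustration (they are not from any record): relation (8) holds for the class ABC, yet A and B are associated, so the attributes are not completely independent.

```python
from fractions import Fraction as F

# Hypothetical third-order frequencies, keyed by (A present, B present, C present).
cells = {
    (1, 1, 1): 1, (1, 1, 0): 2, (1, 0, 1): 0, (1, 0, 0): 1,
    (0, 1, 1): 1, (0, 1, 0): 0, (0, 0, 1): 2, (0, 0, 0): 1,
}
N = sum(cells.values())


def freq(a=None, b=None, c=None):
    """Class-frequency with the given attributes fixed; None leaves an attribute unspecified."""
    return sum(
        count
        for (x, y, z), count in cells.items()
        if a in (None, x) and b in (None, y) and c in (None, z)
    )


# Relation (8) for the single class ABC.
relation_8_holds = F(freq(1, 1, 1), N) == (
    F(freq(a=1), N) * F(freq(b=1), N) * F(freq(c=1), N)
)
# Criterion of independence for A and B in the universe at large.
ab_independent = F(freq(1, 1), N) == F(freq(a=1), N) * F(freq(b=1), N)

print(relation_8_holds, ab_independent)
```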

EXERCISES.

1. Take the following figures for girls corresponding to those for boys in
Example i., p. 45, and discuss them similarly, but not necessarily using
exactly the same comparisons, to see whether the conclusion that "the
connecting link between defects of body and mental dulness is the coincident
defect of brain which may be known by observation of abnormal nerve signs"
seems to hold good.

A, development defects.    B, nerve signs.    D, mental dulness.

N      10,000          (AB)     248
(A)       682          (AD)     307
(B)       850          (BD)     363
(D)       689          (ABD)    128

2.  (Material  from  Census  of  England  and  Wales,  1891,  vol.  iii.)  The 
following  figures  give  the  numbers  of  those  suffering  from  single  or  combined 
infirmities  :  (1)  for  all  males,  (2)  for  males  of  55  years  of  age  and  over. 

A, Blindness.    B, Mental derangement.    C, Deaf-mutism.

               (1)            (2)                      (1)          (2)
            All Males.     Males 55-.               All Males.   Males 55-.

N          14,053,000      1,377,000       (AB)        183           65
(A)            12,281          5,538       (AC)         51           14
(B)            45,392         10,309       (BC)        299           47
(C)             7,707            746       (ABC)        11            3

Tabulate  proportions  per  thousand,  exhibiting  the  total  association  between 
blindness  and  mental  derangement,  and  the  partial  association  between  the 
same  two  infirmities  among  deaf-mutes,  (1)  for  males  in  general,  (2)  for  those 
of  55  years  of  age  or  over.  Give  a  short  verbal  statement  of  the  results,  and 
contrast  them  with  those  of  Question  1. 

3. (Material from supplement to 55th Annual Report Reg.-Genl.)

The death-rate from cancer for occupied males in general (over 15) is
0.685 per thousand per annum, and for farmers 1.20.

The death-rates from cancer for occupied males under and over 45
are 0.13 and 2.25 respectively. Of the farmers 46.1 per cent. are over
45.

Would you say that farmers were peculiarly liable to cancer?

4. A population of males over 15 years of age consists of 7 per cent. over 65
years of age and 93 per cent. under. The death-rates are 12 per thousand per
annum in the younger class and 110 in the older, or 18.86 in the whole
population. The death-rate of males (over 15) engaged in a certain industry
is 26.7 per thousand.

If the industry be not unhealthy, what must be the approximate proportion
of those over 65 engaged in it (neglecting minor differences of age
distribution)?

5. Show that if A and B are independent, while A and C, B and C are
associated, A and B must be disassociated either in the universe of C's,
the universe of γ's, or both.

6. As an illustration of Question 5, show that if the following were actual
data, there would be a slight disassociation between the eye-colours of
husband and wife (father and mother) for the parents either of light-eyed
sons or not-light-eyed sons, or both, although there is a slight positive
association for parents at large.

A, light eye-colour in husband; B, in wife; C, in son.

N      1000          (AB)    358
(A)     622          (AC)    471
(B)     558          (BC)    419
(C)     617


7. Show that if (ABC) = (αβγ), (αBC) = (Aβγ), and so on (the case of
"complete equality of contrary frequencies" of Question 7, Chap. I.), A, B,
and C are completely independent if A and B, A and C, B and C are
independent pair and pair.

8. If, in the same case of complete equality of contraries,

$$(AB) - \frac{N}{4} = \delta_1, \qquad (AC) - \frac{N}{4} = \delta_2, \qquad (BC) - \frac{N}{4} = \delta_3,$$

show that

$$2\left[(ABC) - \frac{(AC)(BC)}{(C)}\right] = 2\left[(AB\gamma) - \frac{(A\gamma)(B\gamma)}{(\gamma)}\right] = \delta_1 - \frac{4\,\delta_2\delta_3}{N},$$

so that the partial associations between A and B in the universes C and γ are
positive or negative according as δ1 is greater or less than 4δ2δ3/N.


9. In the simple contests of a general election (contests in which one
Conservative opposed one Liberal and there were no other candidates) 66 per
cent. of the winning candidates (according to the returns) spent more money
than their opponents. Given that 63 per cent. of the winners were
Conservatives, and that the Conservative expenditure exceeded the Liberal in 80
per cent. of the contests, find the percentages of elections won by Conservatives
(1) when they spent more and (2) when they spent less than their opponents,
and hence say whether you consider the above figures evidence of the influence
of expenditure on election results or no. (Note that if the one candidate in a
contest be a Conservative-winner-who spends more than his opponent, the
other must necessarily be a Liberal-loser-who spends less, and so forth.
Hence the case is one of complete equality of contraries.)

10. Given that (A)/N = (B)/N = (C)/N = x, and that (AB)/N = (AC)/N = y,
find the major and minor limits to y that enable one to infer positive
association between B and C, i.e. (BC)/N > x^2.

Draw a diagram on squared paper to illustrate your answer, taking x and y
as co-ordinates, and shading the limits within which y must lie in order to
permit of the above inference. Point out the peculiarities in the case of
inferring a positive association from two negative associations.

11. Discuss similarly the more complex case (A)/N = x, (B)/N = 2x,
(C)/N = 3x:

(1) for inferring positive association between B and C given
(AB)/N = (AC)/N = y.

(2) for inferring positive association between A and C given
(AB)/N = (BC)/N = y.

(3) for inferring positive association between A and B given
(AC)/N = (BC)/N = y.


CHAPTER  V. 


MANIFOLD  CLASSIFICATION. 

1.  The  general  principle  of  a  manifold  classification — 2-4.  The  table  of 
double-entry  or  contingency  table  and  its  treatment  by  fundamental 
methods — 5-8.  The  coefficient  of  contingency— 9-10.  Analysis  of 
a  contingency  table  by  tetrads— 11-13.  Isotropic  and  anisotropic 
distributions — 14-15.  Homogeneity  of  the  classifications  dealt  with 
in  this  and  the  preceding  chapters  :  heterogeneous  classifications. 

1.  Classification  by  dichotomy  is,  as  was  briefly  pointed  out  in 
Chap.  I.  §  5,  a  simpler  form  of  classification  than  usually  occurs 
in  the  tabulation  of  practical  statistics.  It  may  be  regarded  as 
a  special  case  of  a  more  general  form  in  which  the  individuals  or 
objects observed are first divided under, say, s heads, A_1, A_2 .... A_s,
each of the classes so obtained then subdivided under t heads,
B_1, B_2 .... B_t, each of these under u heads, C_1, C_2 .... C_u, and
so on, thus giving rise to s.t.u .... ultimate classes altogether.

2. The general theory of such a manifold as distinct from a
twofold or dichotomous classification, in the case of n attributes
or characters ABC .... N, would be extremely complex: in the
present chapter the discussion will be confined to the case of two
characters, A and B, only. If the classification of the A's be
s-fold and of the B's t-fold, the frequencies of the st classes of the
second order may be most simply given by forming a table with
s columns headed A_1 to A_s, and t rows headed B_1 to B_t. The
number of the objects or individuals possessing any combination
of the two characters, say A_m and B_n, i.e. the frequency of the
class A_mB_n, is entered in the compartment common to the mth
column and the nth row, the st compartments thus giving all
the second-order frequencies. The totals at the ends of rows
and the feet of columns give the first-order frequencies, i.e. the
numbers of A_m's and B_n's, and finally the grand total at the
right-hand bottom corner gives the whole number of observations.
Tables I. and II. below will serve as illustrations of such tables
of double-entry or contingency tables, as they have been termed
by Professor Pearson (ref. 1).

3. In Table I. the division is 3 x 3-fold: the houses in England
and  Wales  are  divided  into  those  which  are  in  (1)  London,  (2) 
other  urban  districts,  (3)  rural  districts,  and  the  houses  in  each 
of  these  divisions  are  again  classified  into  (1)  inhabited  houses, 
(2)  uninhabited  but  completed  houses,  (3)  houses  that  are 
"building,"  i.e.  in  course  of  erection.  Thus  from  the  first  row 
we  see  that  there  were  in  London,  in  round  numbers,  616,000 
houses,  of  which  571,000  were  inhabited,  40,000  uninhabited, 
and  5000  in  course  of  erection :  from  the  first  column,  there 
were  6,260,000  inhabited  houses  in  England  and  Wales,  of  which 
571,000  were  in  London,  4,064,000  in  other  urban  districts,  and 
1,625,000  in  rural  districts. 


Table I. Houses in England and Wales. (Census of 1901.
Summary Table X.) (000's omitted.)

                                Inhabited.   Uninhabited.   Building.   Total.

Adm. County of London               571            40            5        616
Other urban districts              4064           285           45       4394
Rural districts                    1625           124           12       1761

Total for England and Wales        6260           449           62       6771

In Table II., on the other hand, the classification is 3 x 4-fold:
the eye-colours are classed under the three heads "blue," "grey or
green," and "brown," while the hair-colours are classed under
four heads, "fair," "brown," "black," and "red."

Table II. Hair- and Eye-Colours of 6800 Males in Baden.
(Ammon, Zur Anthropologie der Badener.)

                               Hair-colour.
Eye-colour.          Fair.    Brown.    Black.    Red.     Total.

Blue                 1768       807       189      47       2811
Grey or Green         946      1387       746      53       3132
Brown                 115       438       288      16        857

Total                2829      2632      1223     116       6800

The table is read similarly to the last. Taking the first row, it tells us that
there were 2811 men with blue eyes noted, of whom 1768 had
fair hair, 807 brown hair, 189 black hair, and 47 red hair.
Similarly, from the first column, there were 2829 men with fair
hair, of whom 1768 had blue eyes, 946 grey or green eyes, and
115 brown eyes. The tables are a generalised form of the
fourfold (2 x 2-fold) tables in § 13, Chap. III.

4. For the purpose of discussing the nature of the relation
between the A's and the B's, any such table may be treated on
the principles of the preceding chapters by reducing it in different
ways to 2 x 2-fold form. It then becomes possible to trace the
association between any one or more of the A's and any one or
more of the B's, either in the universe at large or in universes
limited by the omission of one or more of the A's, of the B's, or
of both. Taking Table I., for example, trace the association
between the erection of houses and the urban character of a
district. Adding together the first two rows (i.e. pooling London
and the other urban districts together) and similarly adding the
first two columns, so as to make no distinction between inhabited
and uninhabited houses as long as they are completed, we find:

Proportion of all houses which are in
   course of erection in urban districts      50/5010 = 10 per thousand.
Proportion of all houses which are in
   course of erection in rural districts      12/1761 =  7 per thousand.

There is therefore, as might be expected, a distinct positive
association, a larger proportion of houses being in course of
erection in urban than in rural districts.

If, as another illustration, it be desired to trace the association
between the "uninhabitedness" of houses and the urban character
of the district, the procedure will be rather different. Rows 1
and 2 may be added together as before, but column 3 may be
omitted altogether, as the houses which are only in course of
erection do not enter into the question. We then have:

Proportion of all houses which are
   uninhabited in urban districts            325/4960 = 66 per thousand.
Proportion of all houses which are
   uninhabited in rural districts            124/1749 = 71 per thousand.

The association is therefore negative, the proportion of houses
uninhabited being greater in rural than in urban districts.
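
The reduction of Table I. to fourfold form is a matter of pooling rows and columns, and may be sketched as follows. The figures are those of Table I. (thousands of houses); the helper names `pooled` and `per_thousand` are invented for this illustration.

```python
# Table I.: (inhabited, uninhabited, building), in thousands of houses.
table_I = {
    "London": (571, 40, 5),
    "Other urban": (4064, 285, 45),
    "Rural": (1625, 124, 12),
}


def pooled(rows):
    """Add the named rows together, column by column."""
    return tuple(sum(column) for column in zip(*(table_I[r] for r in rows)))


def per_thousand(part, whole):
    return round(1000 * part / whole)


urban = pooled(["London", "Other urban"])
rural = pooled(["Rural"])

# Houses in course of erection as a proportion of all houses.
building = (per_thousand(urban[2], sum(urban)), per_thousand(rural[2], sum(rural)))
# Uninhabited houses as a proportion of completed houses (column 3 omitted).
uninhabited = (
    per_thousand(urban[1], urban[0] + urban[1]),
    per_thousand(rural[1], rural[0] + rural[1]),
)

print("building, urban and rural:", building)
print("uninhabited, urban and rural:", uninhabited)
```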

The eye- and hair-colour data of Table II. may be treated in a
precisely similar fashion. If, e.g., we desire to trace the
association between a lack of pigmentation in eyes and in hair, rows 1
and 2 may be pooled together as representing the least
pigmentation of the eyes, and columns 2, 3, and 4 may be pooled together
as representing hair with a more or less marked degree of
pigmentation. We then have:

Proportion of light-eyed with fair hair      2714/5943 = 46 per cent.
Proportion of brown-eyed with fair hair       115/857  = 13 per cent.

The association is therefore well-marked. For comparison we
may trace the corresponding association between the most marked
degree of pigmentation in eyes and hair, i.e. brown eyes and
black hair. Here we must add together rows 1 and 2 as before,
and columns 1, 2, and 4, the column for red being really
misplaced, as red represents a comparatively slight degree of
pigmentation. The figures, as computed from Table II., are:

Proportion of brown-eyed with black hair      288/857  = 34 per cent.
Proportion of light-eyed with black hair      935/5943 = 16 per cent.

The  association  is  again  positive  and  well-marked,  but  the 
difference  between  the  two  percentages  is  rather  less  than  in  the 
last  case. 

5.  The  mode  of  treatment  adopted  in  the  preceding  section  rests 
on  first  principles,  and,  if  fully  carried  out,  it  gives  the  most  detailed 
information  possible  with  regard  to  the  relations  of  the  two  attri- 
butes. At  the  same  time  a  distinct  need  is  felt  in  practical  work  for 
some  more  summary  method — a  method  which  will  enable  a  single 
and  definite  answer  to  be  given  to  such  a  question  as — Are  the 
-4's  on  the  whole  distinctly  dependent  on  the  ^'s;  and  if  so,  is  this 
dependence  very  close,  or  the  reverse?  The  subject  of  coefficients 
of  association,  which  affords  the  answer  to  this  question  in  the 
case  of  a  dichotomous  classification,  was  only  dealt  with  briefly 
and  incidentally,  for  it  is  still  the  subject  of  some  controversy : 
further,  where  there  are  only  four  classes  of  the  second  order 
to  be  considered  the  matter  is  not  nearly  so  complex  as  where 
the  number  is,  say,  twenty -five  or  more,  and  the  need  for 
any  summary  coefficient  is  not  so  often  nor  so  keenly  felt.  The 
ideas  on  which  Professor  Pearson's  general  measure  of  de- 
pendence, the  "coefiBcient  of  contingency,"  is  based,  are,  more- 
over, quite  simple  and  fundamental,  and  the  mode  of  calculation 


fair  hair  .... 
Proportion  of  brown-eyed  with 
fair  hair  .... 


115/857  =  13 


Proportion  of  brown-eyed  with 
black  hair  .... 

Proportion  of  light-eyed  with 
black  hair  .... 


288/857  =  34  per  cent. 
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is  therefore  given  in  full  in  the  following  section.  The  advanced 
student  should  refer  to  the  original  memoir  (ref.  1)  for  a  completer 
treatment  of  the  theory  of  the  coefficient,  and  of  its  relation  to 
the  theory  of  variables. 

6. Generalising slightly the notation of the preceding chapters,
let the frequency of $A_m$'s be denoted by $(A_m)$, the frequency of
$B_n$'s by $(B_n)$, and the frequency of objects or individuals possessing
both characters by $(A_m B_n)$. Then, if the $A$'s and $B$'s be completely
independent in the universe at large, we must have for all
values of $m$ and $n$:

$$(A_m B_n) = \frac{(A_m)(B_n)}{N} = (A_m B_n)_0 \qquad (1)$$

If, however, $A$ and $B$ are not completely independent, $(A_m B_n)$ and
$(A_m B_n)_0$ will not be identical for all values of $m$ and $n$. Let
the difference be given by

$$\delta_{mn} = (A_m B_n) - (A_m B_n)_0 \qquad (2)$$

A coefficient such as we are seeking may evidently be based in
some way on these values of $\delta$. It will not do, however, simply to
add them together, for the sum of all the values of $\delta$, some of
which are negative and others positive, must be zero in any case,
the sum of both the $(AB)$'s and the $(AB)_0$'s being equal to the
whole number of observations $N$. It is necessary, therefore, to
get rid of the signs, and this may be done in two simple ways: (1)
by neglecting them and forming the arithmetical instead of the
algebraical sum of the differences $\delta$, or (2) by squaring the differences
and then summing the squares. The first process is the
shorter, but the second the better, as it leads to a coefficient
easily treated by algebraical methods, which the first process
does not: as the student will see later, squaring is very
usefully and very frequently employed for the purpose of eliminating
algebraical signs. Suppose, then, that every $\delta$ is calculated,
and also the ratio of its square to the corresponding value of
$(AB)_0$, and that the sum of all such ratios is, say, $\chi^2$; or, in
symbols, using $\Sigma$ to denote "the sum of all quantities like":

$$\chi^2 = \Sigma\left\{\frac{\delta_{mn}^2}{(A_m B_n)_0}\right\} \qquad (3)$$

Being the sum of a series of squares, $\chi^2$ is necessarily positive,
and if $A$ and $B$ be independent it is zero, because every $\delta$ is zero.
If, then, we form a coefficient $C$ given by the relation

$$C = \sqrt{\frac{\chi^2}{N + \chi^2}} \qquad (4)$$

this coefficient is zero if the characters $A$ and $B$ are completely
independent, and approaches more and more nearly towards
unity as $\chi^2$ increases. In general, no sign should be attached
to the root, for the coefficient simply shows whether the two
characters are or are not independent, and nothing more, but in
some cases a conventional sign may be used. Thus in Table II.
slight pigmentation of eyes and of hair appear to go together,
and the contingency may be regarded as definitely positive. If
slight pigmentation of eyes had been associated with marked
pigmentation of hair, the contingency might have been regarded
as negative. $C$ is Professor Pearson's mean square contingency
coefficient.*

* Professor Pearson (ref. 1) terms $\delta$ a sub-contingency; $\chi^2$ the square
contingency; the ratio $\chi^2/N$, which he denotes by $\phi^2$, the mean square
contingency; and the sum of all the $\delta$'s of one sign only, on which a different
coefficient can be based, the mean contingency.

7. The coefficient, in the simple form (4), has one disadvantage,
viz. that coefficients calculated on different systems of classification
are not comparable with each other. It is clearly desirable
for practical purposes that two coefficients calculated from
the same data classified in two different ways should be, at least
approximately, identical. With the present coefficient this is not
the case: if certain data be classified in, say, (1) 6 x 6-fold, (2)
3 x 3-fold form, the coefficient in the latter form tends to be the
least. The greatest possible value of the coefficient is, in fact,
only unity if the number of classes be infinitely great; for any
finite number of classes the limiting value of $C$ is the smaller the
smaller the number of classes. This may be briefly illustrated as
follows. Replacing $\delta_{mn}$ in equation (3) by its value in terms of
$(A_m B_n)$ and $(A_m B_n)_0$ we have:

$$\chi^2 = \Sigma\left\{\frac{(A_m B_n)^2}{(A_m B_n)_0}\right\} - N \qquad (5)$$

and therefore, denoting the summation by $S$,

$$C = \sqrt{\frac{S - N}{S}} \qquad (6)$$

Now suppose we have to deal with a $t \times t$-fold classification in
which $(A_m) = (B_m)$ for all values of $m$; and suppose, further, that
the association between $A_m$ and $B_m$ is perfect, so that $(A_m B_m) =
(A_m) = (B_m)$ for all values of $m$, the remaining frequencies of the
second order being zero; all the frequency is then concentrated
in the diagonal compartments of the table, and each contributes
$N$ to the sum $S$. The total value of $S$ is accordingly $tN$, and the
value of $C$ is

$$C = \sqrt{\frac{t - 1}{t}}$$

This is the greatest possible value of $C$ for a symmetrical $t \times t$-fold
classification, and therefore, in such a table, for:


t =  2   C cannot exceed 0.707
t =  3   C cannot exceed 0.816
t =  4   C cannot exceed 0.866
t =  5   C cannot exceed 0.894
t =  6   C cannot exceed 0.913
t =  7   C cannot exceed 0.926
t =  8   C cannot exceed 0.935
t =  9   C cannot exceed 0.943
t = 10   C cannot exceed 0.949

It  is  as  well,  therefore,  to  restrict  the  use  of  the  "  coefficient  of 
contingency  "  to  5  x  5-fold  or  finer  classifications.  At  the  same 
time  the  classification  must  not  be  made  too  fine,  or  else  the  value 
of  the  coefficient  is  largely  affected  by  casual  irregularities  of  no 
physical  significance  in  the  class-frequencies  (cf.  the  remarks  in 
Chap.  III.  7-8). 
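The limiting values tabulated above may be verified by constructing a table of perfect association and computing the coefficient from it. The following is a sketch only, in Python: the function names and the choice of 100 observations per diagonal compartment are our own assumptions, made purely for illustration.

```python
import math

def contingency_coefficient(table):
    """C = sqrt((S - N) / S), where S is the sum of (observed)^2 / (independence value)."""
    n_total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    s = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n_total
            s += observed ** 2 / expected
    return math.sqrt((s - n_total) / s)

def perfect_table(t, per_class=100):
    """A t x t table with all the frequency in the diagonal compartments."""
    return [[per_class if i == j else 0 for j in range(t)] for i in range(t)]

for t in range(2, 11):
    from_table = contingency_coefficient(perfect_table(t))
    from_formula = math.sqrt((t - 1) / t)
    print(t, round(from_table, 3), round(from_formula, 3))
```

Each diagonal compartment contributes N to the sum S, so that the value computed from the table agrees with the square root of (t - 1)/t for every t.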


Table III. Independence-Values of the Frequencies for Table II.

                               Hair colour.
Eye colour.          Fair.    Brown.    Black.    Red.

Blue                 1169     1088      506       48.0
Grey or Green        1303     1212      563       53.4
Brown                 357      332      154       14.6

8. As the classification of Table II. is only 3 x 4-fold, it is rather
crude for the purpose of calculating the coefficient, but will serve
simply as an illustration of the form of the arithmetic. In Table
III. are given the values of the independence frequencies, 2829 x
2811/6800 = 1169 and so on. The value of $\chi^2$ is more readily
calculated from equation (5) than from (3):

(1768)^2/1169  =  2673.9
 (946)^2/1303  =   686.8
 (115)^2/357   =    37.0
 (807)^2/1088  =   598.6
(1387)^2/1212  =  1587.3
 (438)^2/332   =   577.8
 (189)^2/506   =    70.6
 (746)^2/563   =   988.5
 (288)^2/154   =   538.6
  (47)^2/48.0  =    46.0
  (53)^2/53.4  =    52.6
  (16)^2/14.6  =    17.5

     Total = S =  7875.2
             N =  6800
      $\chi^2$ =  1075.2

$$C = \sqrt{\frac{1075.2}{7875.2}} = \sqrt{0.1365} = 0.37$$


The  squares  in  such  work  may  conveniently  be  taken  from 
Barlow's  Tables  of  Squares,  Cubes,  etc.  (see  list  of  tables  on 
p.  356),  or  logarithms  may  be  used  throughout — five  figure 
logarithms  are  quite  sufficient. 
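The same arithmetic may be carried through by machine. The sketch below, in Python, takes the observed frequencies of Table II. from the numerators of the working above; the variable and function names are our own. It uses unrounded independence-values, so its value of chi-squared may be expected to differ by a unit or two from the 1075.2 obtained with the rounded entries of Table III., without affecting the coefficient to two places.

```python
import math

# Observed frequencies of Table II.
# Rows: eye colour (blue; grey or green; brown).
# Columns: hair colour (fair, brown, black, red).
observed = [
    [1768, 807, 189, 47],
    [946, 1387, 746, 53],
    [115, 438, 288, 16],
]

def independence_values(table):
    """Return (A_m)(B_n)/N for every compartment of the table."""
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = independence_values(observed)
n_total = sum(map(sum, observed))

# Equation (5): S is the sum of (observed)^2 / (independence value),
# and chi-squared is S - N.
s = sum(o ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row))
chi_squared = s - n_total

# Equation (4): the coefficient of contingency.
c = math.sqrt(chi_squared / (n_total + chi_squared))

print("first independence value:", round(expected[0][0]))
print("chi-squared:", round(chi_squared, 1))
print("C:", round(c, 2))
```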

9. While such a coefficient of contingency, in some form or
other, is a great convenience in many fields of work, its use
should not lead to a neglect of those details which a treatment by
the elementary methods of § 4 would have revealed. Whether
the coefficient be calculated or no, every table should always be
examined with care to see if it exhibit any apparently significant
peculiarities in the distribution of frequency, e.g. in the associations
subsisting between $A_m$ and $B_n$ in limited universes. A good
deal of caution must be used in order not to be misled by casual
irregularities due to paucity of observations in some compartments
of the table, but important points that would otherwise be overlooked
will often be revealed by such a detailed examination.

10. Suppose, for example, that any four adjacent frequencies,
say

$$\begin{array}{cc} (A_m B_n) & (A_{m+1} B_n) \\ (A_m B_{n+1}) & (A_{m+1} B_{n+1}) \end{array}$$

are extracted from the general contingency table. Considering
these as a table exhibiting the association between $A_m$ and $B_n$ in
a universe limited to $A_m$, $A_{m+1}$, $B_n$, $B_{n+1}$ alone, the association is
positive, negative, or zero according as $(A_m B_n)/(A_{m+1} B_n)$ is greater
than, less than, or equal to the ratio $(A_m B_{n+1})/(A_{m+1} B_{n+1})$. The
whole of the contingency table can be analysed into a series of
elementary groups of four frequencies like the above, each one
overlapping its neighbours so that an $rs$-fold table contains
$(r - 1)(s - 1)$ such "tetrads," and the associations in them all can
be very quickly determined by simply tabulating the ratios like
$(A_m B_n)/(A_{m+1} B_n)$, $(A_m B_{n+1})/(A_{m+1} B_{n+1})$, etc., or perhaps better,
the proportions $(A_m B_n)/\{(A_m B_n) + (A_{m+1} B_n)\}$, etc., for every pair
of columns or of rows, as may be most convenient. Taking the
figures of Table II. as an illustration, and working from the
rows, the proportions run as follows:

    For rows 1 and 2.           For rows 2 and 3.

    1768/2714    0.651           946/1061    0.892
     807/2194    0.368          1387/1825    0.760
     189/935     0.202           746/1034    0.721
      47/100     0.470            53/69      0.768

In both cases the first three ratios form descending series, but
the fourth ratio is greater than the second. The signs of the
associations in the six tetrads are accordingly:

    +   +   -
    +   +   -

The negative sign in the two tetrads on the right is striking,
the more so as other tables for hair- and eye-colour, arranged in
the same way, exhibit just the same characteristic. But the
peculiarity will be removed at once if the fourth column be placed
immediately after the first: if this be done, i.e. if "red" be placed
between "fair" and "brown" instead of at the end of the colour-series,
the sign of the association in all the elementary tetrads
will be the same. The colours will then run fair, red, brown,
black, and this would seem to be the more natural order, considering
the depth of the pigmentation.
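The signs of the elementary tetrads can be obtained without forming the ratios, by comparing the cross-products directly. The following Python sketch does so for Table II., first in the order given and then with "red" moved next to "fair"; the function names are our own, and the test of sign by cross-products is simply the ratio test of this section cleared of fractions.

```python
def tetrad_signs(table):
    """Sign of the association in each elementary tetrad of adjacent frequencies.

    For the tetrad with upper pair a, b and lower pair c, d the association
    is positive when a*d > b*c, negative when a*d < b*c, and zero otherwise.
    """
    signs = []
    for upper, lower in zip(table, table[1:]):
        row_signs = []
        for j in range(len(upper) - 1):
            cross = upper[j] * lower[j + 1] - upper[j + 1] * lower[j]
            row_signs.append("+" if cross > 0 else "-" if cross < 0 else "0")
        signs.append(row_signs)
    return signs

def reorder_columns(table, order):
    """Return the table with its columns taken in the given order."""
    return [[row[j] for j in order] for row in table]

# Table II.: columns fair, brown, black, red.
table_ii = [
    [1768, 807, 189, 47],
    [946, 1387, 746, 53],
    [115, 438, 288, 16],
]

print(tetrad_signs(table_ii))
# Columns rearranged to run fair, red, brown, black.
print(tetrad_signs(reorder_columns(table_ii, [0, 3, 1, 2])))
```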

11. A distribution of frequency of such a kind that the
association in every elementary tetrad is of the same sign
possesses several useful and interesting properties, as shown in
the following theorems. It will be termed an isotropic distribution.

(1) In an isotropic distribution the sign of the association is
the same not only for every elementary tetrad of adjacent frequencies,
but for every set of four frequencies in the compartments
common to two rows and two columns, e.g. $(A_m B_n)$, $(A_{m+p} B_n)$,
$(A_m B_{n+q})$, $(A_{m+p} B_{n+q})$.

For suppose that the sign of association in the elementary
tetrads is positive, so that

$$(A_m B_n)(A_{m+1} B_{n+1}) > (A_{m+1} B_n)(A_m B_{n+1}), \qquad (1)$$

and similarly,

$$(A_{m+1} B_n)(A_{m+2} B_{n+1}) > (A_{m+2} B_n)(A_{m+1} B_{n+1}). \qquad (2)$$

Then multiplying up and cancelling we have

$$(A_m B_n)(A_{m+2} B_{n+1}) > (A_{m+2} B_n)(A_m B_{n+1}). \qquad (3)$$

That is to say, the association is still positive though the two
columns $A_m$ and $A_{m+2}$ are no longer adjacent.

(2) An isotropic distribution remains isotropic in whatever way
it may be condensed by grouping together adjacent rows or columns.

Thus from (1) and (3) we have, adding:

$$(A_m B_n)\left[(A_{m+1} B_{n+1}) + (A_{m+2} B_{n+1})\right] > (A_m B_{n+1})\left[(A_{m+1} B_n) + (A_{m+2} B_n)\right],$$

that is to say, the sign of the elementary association is unaffected
by throwing the $(m+1)$th and $(m+2)$th columns into one.

(3)  As  the  extreme  case  of  the  preceding  theorem,  we  may 
suppose  both  rows  and  columns  grouped  and  regrouped  until 
only  a  2  X  2-fold  table  is  left ;  we  then  have  the  theorem — 

If  an  isotropic  distribution  be  reduced  to  a  fourfold  distribution 
in  any  way  whatever,  by  addition  of  adjacent  rows  and  columns, 
the  sign  of  the  association  in  such  fourfold  table  is  the  same  as  in 
the  elementary  tetrads  of  the  original  table. 

The case of complete independence is a special case of isotropy.
For if

$$(A_m B_n) = \frac{(A_m)(B_n)}{N}$$

for all values of $m$ and $n$, the association is evidently zero for
every tetrad. Therefore the distribution remains independent
in whatever way the table be grouped, or in whatever way the
universe be limited by the omission of rows or columns. The
expression "complete independence" is therefore justified.

From the work of the preceding section we may say that Table
II. is not isotropic as it stands, but may be regarded as a disarrangement
of an isotropic distribution. It is best to rearrange
such a table in isotropic order, as otherwise different reductions
to fourfold form may lead to associations of different sign, though
of course they need not necessarily do so.
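Theorem (3) may be illustrated on Table II. once its columns have been put in the order fair, red, brown, black. The Python sketch below reduces the table to fourfold form at every possible pair of cuts between adjacent rows and adjacent columns, and reports the sign of the association in each; the function names are our own, and the sketch is an illustration on one table, not a proof of the theorem.

```python
def fourfold_sign(table, row_cut, col_cut):
    """Group rows above/below row_cut and columns left/right of col_cut,
    then return the sign of a*d - b*c in the resulting fourfold table."""
    a = sum(sum(row[:col_cut]) for row in table[:row_cut])
    b = sum(sum(row[col_cut:]) for row in table[:row_cut])
    c = sum(sum(row[:col_cut]) for row in table[row_cut:])
    d = sum(sum(row[col_cut:]) for row in table[row_cut:])
    cross = a * d - b * c
    return "+" if cross > 0 else "-" if cross < 0 else "0"

def all_fourfold_signs(table):
    """Sign of the association for every reduction by adjacent grouping."""
    n_rows, n_cols = len(table), len(table[0])
    return {(r, c): fourfold_sign(table, r, c)
            for r in range(1, n_rows)
            for c in range(1, n_cols)}

# Table II. with the hair-colour columns in the order fair, red, brown, black.
isotropic_order = [
    [1768, 47, 807, 189],
    [946, 53, 1387, 746],
    [115, 16, 438, 288],
]

for cuts, sign in sorted(all_fourfold_signs(isotropic_order).items()):
    print(cuts, sign)
```

A 3 x 4-fold table admits two cuts between rows and three between columns, so that six fourfold reductions are examined.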

12. The following will serve as an illustration of a table that
is not isotropic, and cannot be rendered isotropic by any rearrangement
of the order of rows and columns.




Table IV. Showing the Frequencies of Different Combinations of
Eye-colours in Father and Son.

(Data of Sir F. Galton, from Karl Pearson, Phil. Trans., A, vol. cxcv.
(1900), p. 138; classification condensed.)

1. Blue.    2. Blue-green, grey.    3. Dark grey, hazel.    4. Brown.

Son's                     Father's Eye-colour.
Eye-colour.       1.      2.      3.      4.      Total.

1                194      70      41      30       335
2                 83     124      41      36       284
3                 25      34      55      23       137
4                 56      36      43     109       244

Total            358     264     180     198      1000

The following are the ratios of the frequency in column m to
the sum of the frequencies in columns m and m + 1:

                      Columns
Row.      1 and 2.    2 and 3.    3 and 4.

1          0.735       0.631       0.577
2          0.401       0.752       0.532
3          0.424       0.382       0.705
4          0.609       0.456       0.283


The  order  in  which  the  ratios  run  is  different  for  each  pair  of 
columns,  and  it  is  accordingly  impossible  to  make  the  table 
isotropic.  The  distribution  of  signs  of  association  in  the  several 
tetrads  is — 

+  -  + 

-  +  - 

-  -  + 

The distribution is a curious one, the associations in tetrads
round the diagonal of the whole table being so markedly positive
and those in the immediately adjacent tetrads equally markedly
negative. Neglecting the other signs, this is the effect that
would be produced by taking an isotropic distribution and then
increasing the frequencies in the diagonal compartments by a
sufficient percentage. Comparison of the given table with others
from the same source shows that the peculiarity is common to
the great majority of the tables, and accordingly its origin
demands explanation. Were such a table treated by the method
of the contingency coefficient, or a similar summary method,
alone, the peculiarity might not be remarked.
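The statement that Table IV. cannot be rendered isotropic by any rearrangement may be checked by direct trial, since a 4 x 4-fold table admits only 24 x 24 orderings of its rows and columns. The Python sketch below makes that trial; the function names are our own, and a distribution is here counted as isotropic when all its elementary tetrads show one and the same sign.

```python
from itertools import permutations

# Table IV.: rows are the son's eye-colour, columns the father's.
table_iv = [
    [194, 70, 41, 30],
    [83, 124, 41, 36],
    [25, 34, 55, 23],
    [56, 36, 43, 109],
]

def tetrad_signs(table):
    """Signs of a*d - b*c for every elementary tetrad, row by row."""
    signs = []
    for upper, lower in zip(table, table[1:]):
        crosses = [upper[j] * lower[j + 1] - upper[j + 1] * lower[j]
                   for j in range(len(upper) - 1)]
        signs.append(["+" if x > 0 else "-" if x < 0 else "0" for x in crosses])
    return signs

def is_isotropic(table):
    """True when every elementary tetrad shows the same sign."""
    return len({s for row in tetrad_signs(table) for s in row}) == 1

def isotropic_arrangements(table):
    """Try every ordering of rows and columns; return those that are isotropic."""
    found = []
    for row_order in permutations(range(len(table))):
        for col_order in permutations(range(len(table[0]))):
            rearranged = [[table[i][j] for j in col_order] for i in row_order]
            if is_isotropic(rearranged):
                found.append((row_order, col_order))
    return found

print(tetrad_signs(table_iv))
print("isotropic arrangements found:", len(isotropic_arrangements(table_iv)))
```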

13. It may be noted, in concluding this part of the subject,
that in the case of complete independence the distribution of
frequency in every row is similar to the distribution in the row
of totals, and the distribution in every column similar to that in
the column of totals; for in, say, the column $A_n$ the frequencies
are given by the relations

$$(A_n B_1) = \frac{(A_n)}{N}(B_1), \quad (A_n B_2) = \frac{(A_n)}{N}(B_2), \quad (A_n B_3) = \frac{(A_n)}{N}(B_3),$$

and so on. This property is of special importance in the theory
of variables.
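The property just stated may be seen at work on the marginal totals of Table II. The Python sketch below forms the frequencies implied by complete independence as exact fractions and compares the distribution within every row and column with the corresponding distribution of totals; the function names are our own, and the marginal totals are those obtained by summing the observed frequencies of Table II.

```python
from fractions import Fraction

def independence_table(row_totals, col_totals):
    """Frequencies (row total)(column total)/N implied by complete independence."""
    n_total = sum(row_totals)
    if n_total != sum(col_totals):
        raise ValueError("marginal totals must agree")
    return [[Fraction(r * c, n_total) for c in col_totals] for r in row_totals]

def proportions(values):
    """Each value as a fraction of the sum of the values."""
    total = sum(values)
    return [Fraction(v) / total for v in values]

# Marginal totals of Table II. (N = 6800).
row_totals = [2811, 3132, 857]          # eye colour
col_totals = [2829, 2632, 1223, 116]    # hair colour

table = independence_table(row_totals, col_totals)

# Every row is distributed like the row of totals,
# and every column like the column of totals.
rows_match = all(proportions(row) == proportions(col_totals) for row in table)
cols_match = all(proportions(col) == proportions(row_totals) for col in zip(*table))
print(rows_match, cols_match)
```

Exact fractions are used so that the comparison is not disturbed by rounding.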

14. The classifications both of this and of the preceding chapters
have one important characteristic in common, viz. that they
are, so to speak, "homogeneous," the principle of division
being the same for all the sub-classes of any one class. Thus
$A$'s and $\alpha$'s are both subdivided into $B$'s and $\beta$'s, $A_1$'s ....
$A_s$'s into $B_1$'s .... $B_t$'s, and so on. Clearly this is necessary
in order to render possible those comparisons on which the
discussions of associations and contingencies depend. If we
only know that amongst the $A$'s there is a certain percentage
of $B$'s, and amongst the $\alpha$'s a certain percentage of $C$'s, there
are no data for any conclusion.

Many  classifications  are,  however,  essentially  of  a  heterogeneous 
character,  e.g.  biological  classifications  into  orders,  genera,  and 
species ;  the  classifications  of  the  causes  of  death  in  vital 
statistics,  and  of  occupations  in  the  census.  To  take  the  last 
case  as  an  illustration,  the  first  "order"  in  the  list  of  occupations 
is  "General  or  Local  Government  of  the  Country,"  subdivided 
under the headings (1) National Government, (2) Local Government.
The next order is "Defence of the Country," with the sub-headings
(1) Army, (2) Navy and Marines, not (1) National
and (2) Local Government again: the sub-heads are necessarily
distinct. Similarly, the third order is "Professional Occupations
and their Subordinate Services," with the fresh sub-heads (1)
Clerical, (2) Legal, (3) Medical, (4) Teaching, (5) Literary and
Scientific, (6) Engineers and Surveyors, (7) Art, Music, Drama,
(8) Exhibitions, Games, etc. The number of sub-heads under
each main heading is, in such a case, arbitrary and variable,
and different for each main heading; but so long as the
classification remains purely heterogeneous, however complex
it may become, there is no opportunity for any discussion
of causation within the limits of the matter so derived. It is
only  when  a  homogeneous  division  is  in  some  way  introduced 
that  we  can  begin  to  speak  of  associations  and  contingencies. 

15.  This  may  be  done  in  various  ways  according  to  the 
nature  of  the  case.  Thus  the  relative  frequencies  of  different 
botanical  families,  genera,  or  species  may  be  discussed  in 
connection with the topographical characters of their habitats
(desert, marsh, or moor), and we may observe statistical associations
between given genera and situations of a given topographical
type.  The  causes  of  death  may  be  classified  according  to  sex, 
or  age,  or  occupation,  and  it  then  becomes  possible  to  discuss 
the  association  of  a  given  cause  of  death  with  one  or  other 
of  the  two  sexes,  with  a  given  age-group,  or  with  a  given 
occupation.  Again,  the  classifications  of  deaths  and  of  occupations 
are  repeated  at  successive  intervals  of  time  ;  and  if  they  have 
remained  strictly  the  same,  it  is  also  possible  to  discuss  the 
association  of  a  given  occupation  or  a  given  cause  of  death  with 
the  earlier  or  later  year  of  observation — i.e.  to  see  whether  the 
numbers  of  those  engaged  in  the  given  occupation  or  succumbing 
to  the  given  cause  of  death  have  increased  or  decreased.  But 
in  such  circumstances  the  greatest  care  must  be  taken  to  see 
that  the  necessary  condition  as  to  the  identity  of  the  classifications 
at  the  two  periods  is  fulfilled,  and  unfortunately  it  very 
seldom  is  fulfilled.  All  practical  schemes  of  classification  are 
subject  to  alteration  and  improvement  from  time  to  time,  and 
these  alterations,  however  desirable  in  themselves,  render  a 
certain  number  of  comparisons  impossible.  Even  where  a 
classification  has  remained  verbally  the  same,  it  is  not  necessarily 
really  the  same;  thus,  in  the  case  of  the  causes  of  death, 
improved  methods  of  diagnosis  may  transfer  many  deaths  from 
one  heading  to  another  without  any  change  in  the  incidence 
of  the  disease,  and  so  bring  about  a  virtual  change  in  the 
classification.  In  any  case,  heterogeneous  classification  should 
be  regarded  only  as  a  partial  process,  incomplete  until  a 
homogeneous  division  is  introduced  either  directly  or  indirectly, 
e.g. by repetition.

REFERENCES.

Contingency.

(1) Pearson, Karl, "On the Theory of Contingency and its Relation to
Association and Normal Correlation," Drapers' Company Research
Memoirs, Biometric Series i.; Dulau & Co., London, 1904. (The
memoir in which the coefficient of contingency is proposed.)

(2) Lipps, G. F., "Die Bestimmung der Abhängigkeit zwischen den
Merkmalen eines Gegenstandes," Berichte der math.-phys. Klasse der
kgl. Sächsischen Gesellschaft der Wissenschaften; Leipzig, 1905. (A
general discussion of the problems of association and contingency.)

(3) Pearson, Karl, "On a Coefficient of Class Heterogeneity or Divergence,"
Biometrika, vol. v. p. 198, 1906. (An application of the contingency
coefficient to the measurement of heterogeneity, e.g. in different
districts of a country, by treating the observed frequencies of some
quality $A_1$, $A_2$ .... $A_n$ in the different districts as rows of a contingency
table and working out the coefficient: the same principle is
also applicable to the comparison of a single district with the rest of
the country.)

Isotropy.

(4) Yule, G. U., "On a Property which holds good for all Groupings of a
Normal Distribution of Frequency for Two Variables, with applications
to the Study of Contingency Tables for the Inheritance of Unmeasured
Qualities," Proc. Roy. Soc., Series A, vol. lxxvii., 1906, p. 324. (On
the property of isotropy and some applications.)

(5) Yule, G. U., "On the Influence of Bias and of Personal Equation in
Statistics of Ill-defined Qualities," Jour. Anthrop. Inst., vol. xxxvi.,
1906, p. 325. (Includes an investigation as to the influence of bias
and of personal equation in creating divergences from isotropy in
contingency tables.)

Contingency Tables of two Rows only.

(6) Pearson, Karl, "On a New Method of Determining Correlation between
a Measured Character A and a Character B of which only the Percentage
of Cases wherein B exceeds (or falls short of) a given Intensity is recorded
for each Grade of A," Biometrika, vol. vii., 1909, p. 96. (Deals with a
measure of dependence for a common type of table, e.g. a table showing
the numbers of candidates who passed or failed at an examination, for
each year of age. The table of such a type stands between the contingency
tables for unmeasured characters and the correlation table
(chap. IX.) for variables. Pearson's method is based on that adopted
for the correlation table, and assumes a normal distribution of frequency
(chap. XV.) for B.)

(7) Pearson, Karl, "On a New Method of Determining Correlation, when
one Variable is given by Alternative and the other by Multiple
Categories," Biometrika, vol. vii., 1910, p. 248. (The similar
problem for the case in which the variable is replaced by an unmeasured
quality.)

EXERCISES. 

(1) (Data from Karl Pearson, "On the Inheritance of the Mental and Moral
Characters in Man," Jour. of the Anthrop. Inst., vol. xxxiii., and Biometrika,
vol. iii.) Find the coefficient of contingency (coefficient of mean square
contingency) for the two tables below, showing the resemblance between
brothers for athletic capacity and between sisters for temper. Show that
neither table is even remotely isotropic. (As stated in § 7, the coefficient of
contingency should not as a rule be used for tables smaller than 5 x 5-fold:
these small tables are given to illustrate the method, while avoiding lengthy
arithmetic.)

A. Athletic Capacity.

                              First Brother.
                   Athletic.   Betwixt.   Non-athletic.   Total.

Athletic              906         20          140          1066
Betwixt                20         76            9           105
Non-athletic          140          9          370           519

Total                1066        105          519          1690

B. Temper.

                              First Sister.
                   Quick.   Good-natured.   Sullen.   Total.

Quick               198         177            77       452
Good-natured        177         996           165      1338
Sullen               77         165           120       362

Total               452        1338           362      2152

PART  II.— THE  THEORY  OF  VARIABLES. 


CHAPTER  VI. 

THE  FREQUENCY-DISTRIBUTION. 

1. Introductory. 2. Necessity for classification of observations: the frequency
distribution. 3. Illustrations. 4. Method of forming the table. 5.
Magnitude of class-interval. 6. Position of intervals. 7. Process of
classification. 8. Treatment of intermediate observations. 9. Tabulation.
10. Tables with unequal intervals. 11. Graphical representation
of the frequency-distribution. 12. Ideal frequency-distributions.
13. The symmetrical distribution. 14. The moderately asymmetrical
distribution. 15. The extremely asymmetrical or J-shaped distribution.
16. The U-shaped distribution.

1. The methods described in Chaps. I.-V. are applicable to all
observations, whether qualitative or quantitative; we have now
to proceed to the consideration of specialised processes, definitely
adapted to the treatment of quantitative measurements, but not
as a rule available (with some important exceptions, as suggested
by Chap. I. § 2) for the discussion of purely qualitative observations.
Since numerical measurement is applied only in the case
of a quantity that can present more than one numerical value,
that is, a varying quantity, or more shortly a variable, this section
of the work may be termed the theory of variables. As common
examples of such variables that are subject to statistical treatment
may be cited birth- or death-rates, prices, wages, barometer
readings, rainfall records, and measurements or enumerations (e.g.
of glands, spines, or petals) on animals or plants.

2.  If  some  hundreds  or  thousands  of  values  of  a  variable  have 
been  noted  merely  in  the  arbitrary  order  in  which  they  happened 
to  occur,  the  mind  cannot  properly  grasp  the  significance  of  the 
record  :  the  observations  must  be  ranked  or  classified  in  some 
way  before  the  characteristics  of  the  series  can  be  comprehended, 
and  those  comparisons,  on  which  arguments  as  to  causation 
depend, can be made with other series. The dichotomous classification,
considered in Chaps. I.-IV., is too crude: if the values are
merely classified as A's or α's according as they exceed or fall
short of some fixed value, a large part of the information given
by the original record is lost. A manifold classification, however
(cf. Chap. V.), avoids the crudity of the dichotomous form, since
the  classes  may  be  made  as  numerous  as  we  please,  and  numerical 
measurements  lend  themselves  with  peculiar  readiness  to  a 
manifold  classification,  for  the  class  limits  can  be  conveniently 
and  precisely  defined  by  assigned  values  of  the  variable.  For 
convenience,  the  values  of  the  variable  chosen  to  define  the 
successive  classes  should  be  equidistant,  so  that  the  numbers  of 
observations  in  the  different  classes  (the  class-frequencies)  may  be 
comparable.  Thus  for  measurements  of  stature  the  interval 
chosen  for  classifying  (the  class-interval,  as  it  may  be  termed) 
might  be  1  inch,  or  2  centimetres,  the  numbers  of  individuals 
being  counted  whose  statures  fall  within  each  successive  inch,  or 
each  successive  2  centimetres,  of  the  scale ;  returns  of  birth-  or 
death-rates  might  be  grouped  to  the  nearest  unit  per  thousand 
of  the  population ;  returns  of  wages  might  be  classified  to  the 
nearest  shilling,  or,  if  desired  to  obtain  a  more  condensed  table, 
by  intervals  of  five  shillings  or  ten  shillings,  and  so  on.  When 
the  variation  is  discontinuous,  as  for  example  in  enumerations 
of  numbers  of  children  in  families  or  of  petals  on  flowers,  the 
unit  is  naturally  taken  as  the  class-interval  unless  the  range  of 
variation  is  very  great.  The  manner  in  which  the  observations 
are  distributed  over  the  successive  equal  intervals  of  the  scale  is 
spoken  of  as  the  frequency-distribution  of  the  variable. 
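
The counting described in this section can be sketched in a few lines of Python. This is an illustration only: the function name `frequency_distribution`, its `origin` and `width` parameters, and the sample statures are assumptions introduced here and do not come from any record cited in the text.

```python
from collections import Counter
import math


def frequency_distribution(values, origin, width):
    """Count observations in successive equal class-intervals.

    Class k runs from origin + k*width up to, but not including,
    origin + (k+1)*width.
    """
    counts = Counter(math.floor((v - origin) / width) for v in values)
    first, last = min(counts), max(counts)
    return [
        (origin + k * width, origin + (k + 1) * width, counts[k])
        for k in range(first, last + 1)
    ]


# Hypothetical statures in inches, grouped by 1-inch class-intervals.
statures = [66.2, 67.9, 67.1, 68.4, 66.8, 70.3, 67.5]
table = frequency_distribution(statures, origin=0, width=1)
```

Each tuple gives the lower limit, the upper limit, and the class-frequency. Empty classes inside the range are kept, so that the intervals stay equidistant and the frequencies remain comparable.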

3.  A  few  illustrations  will  make  clearer  the  nature  of  such 
frequency-distributions,  and  the  service  which  they  render  in 
summarising  a  long  and  complex  record  : — 

(a)  Table  I.  In  this  illustration  the  mean  annual  death-rates, 
expressed  as  proportions  per  thousand  of  the  population  per 
annum,  of  the  632  registration  districts  of  England  and  Wales, 
for  the  decade  1881-90,  have  been  classified  to  the  nearest  unit ; 
i.e.  the  numbers  of  districts  have  been  counted  in  which  the 
death-rate was over 12.5 but under 13.5, over 13.5 but under
14.5, and so on. The frequency-distribution is shown by the
following  table. 


Table I. Showing the Numbers of Registration Districts in England and
Wales with Different mean Death-rates per Thousand of the Population
per Annum for the Ten Years 1881-90. (Material from the Supplement
to the 56th Annual Report of the Registrar-General for England and
Wales [C. 7769], 1895.)

Mean Annual    Number of Districts    Mean Annual    Number of Districts
Death-rate.    with Death-rate        Death-rate.    with Death-rate
               between Limits.                       between Limits.

12.5-13.5             5               23.5-24.5             5
13.5-14.5            16               24.5-25.5             3
14.5-15.5            61               25.5-26.5             1
15.5-16.5           112               26.5-27.5             1
16.5-17.5           159               27.5-28.5             2
17.5-18.5           104               28.5-29.5             -
18.5-19.5            67               29.5-30.5             -
19.5-20.5            42               30.5-31.5             2
20.5-21.5            25               31.5-32.5             -
21.5-22.5            18               32.5-33.5             1
22.5-23.5             8
                                      Total               632

Whilst  a  glance  through  the  original  returns  fails  to  convey 
any  very  definite  impression,  owing  to  the  large  and  erratic 
differences  between  the  death-rates  in  successive  districts,  a  brief 
inspection  of  the  above  table  brings  out  a  number  of  important 
points.  Thus  we  see  that  the  death-rates  range,  in  round 
numbers,  from  13  to  33  per  thousand  per  annum,  but  in  the 
great  majority  of  districts  lie  nearer  the  lower  limit  than  the 
upper; that the death-rates in some 60 per cent. of the districts
lie within the narrow limits 15.5 to 18.5, the rates being most
frequent  near  17  per  thousand,  and  so  forth. 
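
The points read off Table I. above can be checked by a short computation. The sketch below assumes the class-frequencies as printed in Table I., with empty classes entered as zero; the variable names are illustrative only.

```python
# Class mid-values (deaths per thousand) and class-frequencies of Table I.
mid_values = list(range(13, 34))
frequencies = [5, 16, 61, 112, 159, 104, 67, 42, 25, 18, 8,
               5, 3, 1, 1, 2, 0, 0, 2, 0, 1]

total = sum(frequencies)

# Districts with death-rates between 15.5 and 18.5 (mid-values 16 to 18).
central = sum(f for m, f in zip(mid_values, frequencies) if 16 <= m <= 18)
share_central = central / total

# Mid-value of the class with the greatest frequency.
modal_mid_value = mid_values[frequencies.index(max(frequencies))]
```

On these figures 375 of the 632 districts fall within the limits 15.5 to 18.5, which is the "some 60 per cent." of the text, and the most frequent class is that centred on 17 per thousand.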

(b)  Table  II.  The  ages  at  death,  in  years,  of  the  married 
women  in  certain  Quaker  families  were  recorded  and  classified  in 
5-year groups according as they were over 17.5 but under 22.5,
over 22.5 but under 27.5, and so on. The frequency-distribution
was  as  follows  : — 


Table II. Showing the Numbers of Married Women, in certain Quaker
Families, Dying at Different Ages. (Cited from Proc. Roy. Soc., vol. lxvii.
(1900), p. 172. On the Correlation between Duration of Life and Number
of Offspring, by Miss M. Beeton, Karl Pearson, and G. U. Yule.)

Age at Death,    Number of Women    Age at Death,    Number of Women
Years.           Dying between      Years.           Dying between
                 said Years                          said Years
                 of Age.                             of Age.

17.5-22.5              29           62.5-67.5               73
22.5-27.5              87           67.5-72.5               83
27.5-32.5              99           72.5-77.5               77
32.5-37.5             109           77.5-82.5               78
37.5-42.5              90           82.5-87.5               59
42.5-47.5              87           87.5-92.5               26
47.5-52.5              64           92.5-97.5                7
52.5-57.5              54           97.5-102.5               4
57.5-62.5              69
                                    Total                 1095

The  distribution  is  somewhat  more  irregular  than  in  the  last 
case ;  the  commencement  is  abrupt ;  a  maximum  frequency  is 
attained in the fourth class (age at death 32.5 to 37.5), and then
there is a slow fall to the age-class 52.5-57.5. After this class
the frequency rises again and attains a secondary maximum in
the age-class 67.5-72.5.
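
The rise and fall just described can be traced mechanically. The sketch below, with the frequencies of Table II. typed in and a helper `local_maxima` that is an assumption of this example, marks every class whose frequency exceeds that of both its neighbours.

```python
# Lower class-limits and class-frequencies of Table II. (5-year groups).
lower_limits = [17.5 + 5 * k for k in range(17)]
frequencies = [29, 87, 99, 109, 90, 87, 64, 54, 69,
               73, 83, 77, 78, 59, 26, 7, 4]


def local_maxima(freqs):
    """Indices of classes whose frequency exceeds both neighbours."""
    padded = [0] + list(freqs) + [0]
    return [i - 1 for i in range(1, len(padded) - 1)
            if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]]


peaks = [(lower_limits[i], lower_limits[i] + 5, frequencies[i])
         for i in local_maxima(frequencies)]
```

Besides the two maxima named in the text, so bare a comparison also flags the class 77.5-82.5, where 78 just exceeds the neighbouring 77; a difference of one observation is the kind of irregularity that the description above passes over.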

(c)  Table  III.  The  numbers  of  stigmatic  rays  on  a  number 
of  Shirley  poppies  were  counted.  As  the  range  of  variation  is 
not  great,  the  unit  is  taken  as  the  class-interval.  The  frequency- 
distribution  is  given  by  the  following  table. 

Table III. Showing the Frequencies of Seed Capsules on certain Shirley
Poppies, with Different Numbers of Stigmatic Rays. (Cited from
Biometrika, ii. p. 89, 1902.)

Number of     Number of Capsules    Number of     Number of Capsules
Stigmatic     with said Number      Stigmatic     with said Number
Rays.         of Stigmatic Rays.    Rays.         of Stigmatic Rays.

 6                    3             14                  302
 7                   11             15                  234
 8                   38             16                  128
 9                  106             17                   50
10                  152             18                   19
11                  238             19                    3
12                  305             20                    1
13                  315
                                    Total              1905

The numbers of rays range from 6 to 20, with 12, 13, or 14 rays
being the most usual.

4.  To  expand  slightly  the  brief  description  given  in  §  2,  tables 
like  the  preceding  are  formed  in  the  following  way  : — (1)  The 
magnitude  of  the  class-interval,  i.e.  the  number  of  units  to  each 
interval, is first fixed; one unit was chosen in the case of Tables
I. and III., five units in the case of Table II. (2) The position or
origin of the intervals must then be determined, e.g. in Table I.
we must decide whether to take as intervals 12-13, 13-14, 14-15,
etc., or 12.5-13.5, 13.5-14.5, 14.5-15.5, etc. (3) This choice
having  been  made,  the  complete  scale  of  intervals  is  fixed,  and  the 
observations  are  classified  accordingly.  (4)  The  process  of 
classification  being  finished,  a  table  is  drawn  up  on  the  general 
lines of Tables I.-III., showing the total numbers of observations
in  each  class-interval.  Some  remarks  may  be  made  on  each  of 
these  heads. 

5. Magnitude of Class-Interval. As already remarked, in cases
where  the  variation  proceeds  by  discrete  steps  of  considerable 
magnitude  as  compared  with  the  range  of  variation,  there  is  very 
little  choice  as  regards  the  magnitude  of  the  class-interval.  The 
unit  will  in  general  have  to  serve.  But  if  the  variation  be  con- 
tinuous, or  at  least  take  place  by  discrete  steps  which  are  small 
in  comparison  with  the  whole  range  of  variation,  there  is  no  such 
natural  class-interval,  and  its  choice  is  a  matter  for  judgment. 

The two conditions which guide the choice are these: (a) we
desire to be able to treat all the values assigned to any one class,
without serious error, as if they were equal to the mid-value
of the class-interval, e.g. as if the death-rate of every district in
the first class of Table I. were exactly 13.0, the death-rate of
every district in the second class 14.0, and so on; (b) for
convenience and brevity we desire to make the interval as large as
possible, subject to the first condition. These conditions will
generally  be  fulfilled  if  the  interval  be  so  chosen  that  the  whole 
number  of  classes  lies  between  15  and  25.  A  number  of  classes 
less  than,  say,  ten  leads  in  general  to  very  appreciable  inaccuracy, 
and  a  number  over,  say,  thirty  makes  a  somewhat  unwieldy 
table.  A  preliminary  inspection  of  the  record  should  accordingly 
be  made  and  the  highest  and  lowest  values  be  picked  out. 
Dividing the difference between these by, say, twenty-five, we
have  an  approximate  value  for  the  interval.  The  actual  value 
should  be  the  nearest  integer  or  simple  fraction. 
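
The rule of thumb just given may be sketched as follows. The function name `suggest_class_interval` and, in particular, the list of "simple" candidate values are assumptions made for this illustration; the text itself says only that the nearest integer or simple fraction should be taken.

```python
def suggest_class_interval(lowest, highest, classes=25):
    """Approximate class-interval: the range divided by the desired
    number of classes, moved to the nearest 'simple' value."""
    raw = (highest - lowest) / classes
    # Candidate simple values: an assumption, not a list from the text.
    simple = [0.1, 0.2, 0.25, 0.5, 1, 2, 5, 10, 20, 25, 50, 100]
    return min(simple, key=lambda s: abs(s - raw))


# Death-rates of Table I. range, in round numbers, from 12.5 to 33.5.
interval = suggest_class_interval(12.5, 33.5)
number_of_classes = (33.5 - 12.5) / interval
```

For the death-rates of Table I. the raw quotient is 0.84, the nearest simple value is the unit, and the resulting 21 classes lie within the range of 15 to 25 recommended above.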

6.  Position  of  Intervals. — The  position  or  starting-point  of  the 
intervals  is,  as  a  rule,  more  or  less  indifferent,  but  in  general  it 
is  fixed  either  so  that  the  limits  of  intervals  are  integers,  or,  as  in 
Tables I. and II., so that the mid-values are integers. It may,
however, be chosen, for simplicity in classification, so that no
limit corresponds exactly to any recorded value (cf. § 8 below). In
some  exceptional  cases,  moreover,  the  observations  exhibit  a  marked 
clustering  round  certain  values,  e.g.  tens,  or  tens  and  fives.  This 
is  generally  the  case,  for  instance,  in  age  returns,  owing  to  the 
tendency  to  state  a  round  number  where  the  true  age  is  unknown. 
Under  such  circumstances,  the  values  round  which  there  is  a 
marked  tendency  to  cluster  should  preferably  be  made  mid-values 
of  intervals,  in  order  to  avoid  sensible  error  in  the  assumption  that 
the  mid-value  is  approximately  representative  of  the  values  in  the 
class.  Thus,  in  the  case  of  ages,  since  the  clustering  is  chiefly  round 
tens,  "  25  and  under  35,"  "  35  and  under  45,"  etc.,  the  classification 
of  the  English  census,  is  a  better  grouping  than  "  20  and  under 
30," "30 and under 40," and so on (cf. the Census of England and
Wales, 1911, vol. vii., and also ref. 5, in which a different view is
taken).  When  there  is  any  probability  of  a  clustering  of  this  kind 
occurring,  it  is  as  well  to  subject  the  raw  material  to  a  close 
examination  before  finally  fixing  the  classification. 
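
One simple form of the close examination recommended here is to tally the recorded values by their final digit. The sketch below uses invented ages; the function name `final_digit_counts` and the data are hypothetical.

```python
from collections import Counter


def final_digit_counts(ages):
    """Tally recorded whole-number ages by their final digit."""
    return Counter(age % 10 for age in ages)


# Hypothetical returns in which round numbers are over-reported.
ages = [30, 30, 31, 35, 40, 40, 40, 42, 45, 50, 50, 58, 60, 60]
counts = final_digit_counts(ages)
most_common_digit, its_count = counts.most_common(1)[0]
```

A marked excess of one final digit, here the 0, suggests that the values ending in that digit should be made mid-values of the class-intervals, as in the grouping "25 and under 35."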

7.  Classification. — The  scale  of  intervals  having  been  fixed,  the 
observations  may  be  classified.  If  the  number  of  observations  is 
not  large,  it  will  be  sufficient  to  mark  the  limits  of  successive 
intervals  in  a  column  down  the  left-hand  side  of  a  sheet  of  paper, 
and  transfer  the  entries  of  the  original  record  to  this  sheet  by 
marking  a  1  on  the  line  corresponding  to  any  class  for  each  entry 
assigned  thereto.  It  saves  time  in  subsequent  totalling  if  each 
fifth  entry  in  a  class  is  marked  by  a  diagonal  across  the  preceding 
four,  or  by  leaving  a  space. 

The  disadvantage  in  this  process  is  that  it  offers  no  facilities  for 
checking :  if  a  repetition  of  the  classification  leads  to  a  different 
result,  there  is  no  means  of  tracing  the  error.  If  the  number  of 
observations  is  at  all  considerable  and  accuracy  is  essential,  it  is 
accordingly  better  to  enter  the  values  observed  on  cards,  one  to 
each  observation.  These  are  then  dealt  out  into  packs  according 
to  their  classes,  and  the  whole  work  checked  by  running  through 
the  pack  corresponding  to  each  class,  and  verifying  that  no  cards 
have  been  wrongly  sorted. 

8.  In  some  cases  difficulties  may  arise  in  classifying,  owing  to 
the  occurrence  of  observed  values  corresponding  to  class-limits. 
Thus,  in  compiling  Table  I.,  some  districts  will  have  been  noted 
with  death-rates  entered  in  the  Registrar-General's  returns  as 
16.5, 17.5, or 18.5, any one of which might at first sight have
been  apparently  assigned  indifferently  to  either  of  two  adjacent 
classes.  In  such  a  case,  however,  where  the  original  figures  for 
numbers  of  deaths  and  population  are  available,  the  difficulty  may 
be readily surmounted by working out the rate to another place
of decimals: if the rate stated to be 16.50 proves to be 16.502, it
will be sorted to the class 16.5-17.5; if 16.498, to the class
15.5-16.5. Death-rates that work out to half-units exactly do
not  occur  in  this  example,  and  so  there  is  no  real  difficulty.  In 
the  case  of  Table  II.,  again,  there  is  no  difficulty  :  if  the  year  of 
birth  and  death  alone  are  given,  the  age  at  death  is  only  calcul- 
able to  the  nearest  unit ;  if  the  actual  day  of  birth  and  death  be 
cited,  half-years  still  cannot  occur  in  the  age  at  death,  because 
there  is  an  odd  number  of  days  in  the  year.  The  difficulty  may 
always  be  avoided  if  it  be  borne  in  mind  in  fixing  the  limits 
to  class-intervals,  these  being  carried  to  a  further  place  of  decimals, 
or  a  smaller  fraction,  than  the  values  in  the  original  record.  Thus 
if  statures  are  measured  to  the  nearest  centimetre,  the  class- 
intervals may be taken as 150.5-151.5, 151.5-152.5, etc.; if to
the nearest eighth of an inch, the intervals may be 59 15/16-60 15/16,
60 15/16-61 15/16, and so on.

If  the  difficulty  is  not  evaded  in  any  of  these  ways,  it  is 
usual  to  assign  one-half  of  an  intermediate  observation  to  each 
adjacent  class,  with  the  result  that  half-units  occur  in  the 
class-frequencies (cf. Tables VII., p. 90, X., p. 96, and XI.,
p. 96). The procedure is rough, but probably good enough for
practical purposes; strict precision is usually unattainable, for in
point of fact the odd way in which different individuals read a
scale (cf. Supplement I.) renders it impossible to assign exact
limits  to  intervals. 
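
The convention of halving an intermediate observation can be sketched thus. The function `classify_with_halves` and the sample statures are assumptions of this example; a value falling on the outermost limit of the whole scale, which has no second class to share it, is here counted whole.

```python
from collections import defaultdict
from fractions import Fraction


def classify_with_halves(values, limits):
    """Class-frequencies for the classes limits[i] to limits[i+1]; a value
    equal to an interior limit counts one-half to each adjacent class."""
    freq = defaultdict(Fraction)
    half = Fraction(1, 2)
    last = len(limits) - 2
    for v in values:
        for i in range(len(limits) - 1):
            lo, hi = limits[i], limits[i + 1]
            if lo < v < hi:
                freq[i] += 1
            elif v == lo:
                freq[i] += 1 if i == 0 else half
            elif v == hi:
                freq[i] += 1 if i == last else half
    return [freq[i] for i in range(len(limits) - 1)]


# Hypothetical statures recorded to the nearest half-inch.
limits = [59.5, 60.5, 61.5, 62.5]
values = [60.0, 60.5, 61.0, 61.5, 61.5, 62.0]
frequencies = classify_with_halves(values, limits)
```

The half-units in the result are those which appear in the class-frequencies of tables compiled in this way, and the total is still the whole number of observations.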

9.  Tabulation. — As  regards  the  actual  drafting  of  the  final 
table,  there  is  little  to  be  said,  except  that  care  should  be  taken 
to  express  the  class-limits  clearly,  and,  if  necessary,  to  state  the 
manner  in  which  the  difficulty  of  intermediate  values  has  been 
met  or  evaded.  The  class-limits  are  perhaps  best  given  as  in 
Tables  I.  and  II.,  but  may  be  more  briefly  indicated  by  the  mid- 
values  of  the  class-intervals.  Thus  Table  I.  might  have  been 
given  in  the  form — 

Death-rate per 1000      Number of
per annum to the         Districts with
Nearest Unit.            said Death-rate.

13                              5
14                             16
15                             61
16                            112
etc.                          etc.

A common mode of defining the class-intervals is to state the
limits in the form "x and less than y." In the case of
measurements of stature, for example, the table might run:

Stature in Inches.         Number of
                           Observations.

57 and less than 58              2
58 and less than 59              4
59 and less than 60             14
etc.                           etc.

the statement "57 and less than 58," etc., being often abbreviated
to 57-, 58-, 59-, etc. (cf. Table VI., p. 88). The mode of grouping
is,  in  effect,  that  described  in  the  last  paragraph  as  of  service  in 
avoiding  intermediate  observations,  but  it  should  be  noted  that  the 
form  of  statement  leaves  the  class-limits  uncertain  unless  the  degree 
of accuracy of the measurements is also given. Thus, if
measurements were taken to the nearest eighth of an inch, the
class-limits are really 56 15/16-57 15/16, 57 15/16-58 15/16, etc.; if
they were only taken to the nearest quarter of an inch, the limits are
56 7/8-57 7/8, 57 7/8-58 7/8, etc. With such a form of tabulation a
statement as to the number of significant figures in the original
record is therefore essential. It is better, perhaps, to state the
true  class-limits  and  avoid  ambiguity. 
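
The true limits follow from the stated limits by subtracting half the unit of measurement, since a recorded value stands for every true value within half a unit on either side of it. A minimal sketch, in which the function name `true_class_limits` is an assumption of this example:

```python
from fractions import Fraction


def true_class_limits(lower, upper, precision):
    """True limits of the class "lower and less than upper" when
    measurements are recorded to the nearest `precision`."""
    half_step = Fraction(precision) / 2
    return lower - half_step, upper - half_step


# The class "57 and less than 58" under two degrees of accuracy.
eighth = true_class_limits(57, 58, Fraction(1, 8))
quarter = true_class_limits(57, 58, Fraction(1, 4))
```

The first result is 56 15/16 to 57 15/16 and the second 56 7/8 to 57 7/8, agreeing with the limits stated in the text.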

10.  The  rule  that  class-intervals  should  be  all  equal  is  one 
that is very frequently broken in official statistical publications,
principally  in  order  to  condense  an  otherwise  unwieldy  table, 
thus  not  only  saving  space  in  printing  but  also  considerable 
expense  in  compilation,  or  possibly,  in  the  case  of  confidential 
figures,  to  avoid  giving  a  class  which  would  contain  only  one  or 
two  observations,  the  identity  of  which  might  be  guessed.  It 
would  hardly  be  legitimate,  for  example,  to  give  a  return  of 
incomes  relating  to  a  limited  district  in  such  a  form  that  the 
income  of  the  two  or  three  wealthiest  men  in  the  district  would 
be  clear  to  any  intelligent  reader  with  local  knowledge.  If  the 
intervals  be  made  unequal,  the  application  of  many  statistical 
methods  is  rendered  awkward,  or  even  impossible,  and  the 
relative  values  of  the  frequencies  are  at  first  sight  misleading,  so 
that the table is not perspicuous. Thus, consider the first two
columns  of  Table  IV.,  showing  the  numbers  of  dwelling-houses 
of  different  annual  values,  assessed  to  inhabited  house  duty.  On 
running  the  eye  down  the  column  headed  "number  of  houses"  it 
is  at  once  caught  by  the  two  striking  irregularities  at  the  classes 
"£60  and  under  £80,"  and  "£100  and  under  £150."  But  these 
have  no  real  significance ;  they  are  merely  due  to  changes  from 
a  £10  to  a  £20,  and  then  to  a  £50  interval.  Moreover,  the 
intervals  after  £150  go  on  continuously  increasing,  but  attention 
is  not  directed  thereto  by  any  marked  changes  in  the  frequencies. 
To make the latter really comparable inter se, they must first be
reduced to a common interval as basis, e.g. £10, by dividing the
fifth and sixth numbers by 2, the seventh by 5, the eighth by 15,
and so on. This gives the mean frequencies per £10 interval
tabulated in the third column of Table IV. The reduction is,
however, impossible in the case of the last class, for we are only
told the number of houses of £1000 annual value and upwards:
the magnitude of the class is indefinite. Such an indefinite class
is in many respects a great inconvenience, and should always be
avoided in work not subject to the necessary limitations of
official publications.

Table IV. Showing the Annual Value and Number of Dwelling-houses in
Great Britain assessed to Inhabited House Duty in 1885-6. (Cited from
Jour. Roy. Stat. Soc., vol. l., 1887, p. 610.)

Annual Value in £'s.        Number         Frequency
                            of Houses.     per £10
                                           Interval.

£20 and under £30            306,408        306,408
 30     "      40            182,972        182,972
 40     "      50            105,407        105,407
 50     "      60             63,096         63,096
 60     "      80             71,436         35,718
 80     "     100             32,365         16,182
100     "     150             41,336          8,267
150     "     300             26,732          1,782
300     "     500              6,198            310
500     "    1000              2,098             42
1000 and upwards                 644              -

Total number of houses       838,692
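
The reduction to a common £10 basis described above amounts to dividing each class-frequency by the number of £10 intervals which the class contains. A short sketch with the figures of Table IV., the function name `frequency_per_interval` being an assumption of this example:

```python
# (lower limit, upper limit, number of houses) for the classes of Table IV.
# with definite limits; the open class "1000 and upwards" is omitted,
# since its magnitude is indefinite.
classes = [
    (20, 30, 306408), (30, 40, 182972), (40, 50, 105407), (50, 60, 63096),
    (60, 80, 71436), (80, 100, 32365), (100, 150, 41336),
    (150, 300, 26732), (300, 500, 6198), (500, 1000, 2098),
]


def frequency_per_interval(lower, upper, count, basis=10):
    """Mean frequency per `basis`-pound interval within one class."""
    return count * basis / (upper - lower)


per_ten = [frequency_per_interval(*c) for c in classes]
```

Rounded to whole houses, the results reproduce the third column of the table, and the apparent irregularities at "£60 and under £80" and "£100 and under £150" disappear.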

The  general  rule  that  intervals  should  be  equal  must  not  be 
held  to  bar  the  analysis  by  smaller  equal  intervals  of  some 
portion  of  the  range  over  which  the  frequency  varies  very 
rapidly. In Table XII., p. 98, for example, giving the numbers
of  deaths  from  diphtheria  at  successive  ages,  a  five-year  interval 
might  be  substituted  with  advantage  for  the  irregular  intervals 
after  the  fifth  year  of  age,  but  it  would  still  be  desirable  to  give 
the  numbers  of  deaths  in  each  year  for  the  first  five  years,  so  as 
to  bring  out  the  rapid  rise  to  the  maximum  in  the  fourth  year 
of  life. 

11.  When  the  table  has  been  completed,  it  is  often  convenient 
to  represent  the  frequency-distribution  by  means  of  a  diagram 
which  conveys  the  general  run  of  the  observations  to  the  eye 
better than a column of figures. The following short table,
giving the distribution of head-breadths for 1000 men, will serve
as an example.

Table V. Showing the Frequency-distribution of Head-breadths for Students
at Cambridge. Measurements taken to the nearest tenth of an inch.
(Cited from W. R. Macdonell, Biometrika, i., 1902, p. 220.)

Head-breadth    Number of        Head-breadth    Number of
in Inches.      Men with said    in Inches.      Men with said
                Head-breadth.                    Head-breadth.

5.5                   3          6.3                  99
5.6                  12          6.4                  37
5.7                  43          6.5                  15
5.8                  80          6.6                  12
5.9                 131          6.7                   3
6.0                 236          6.8                   2
6.1                 185
6.2                 142          Total              1000

Taking a piece of squared paper ruled, say, in inches and tenths,
mark off along a horizontal base-line a scale representing
class-intervals; a half-inch to the class-interval would be suitable.
Then choose a vertical scale for the class-frequencies, say 50
observations per interval to the inch, and mark off, on the
verticals or ordinates through the points marked 5.5, 5.6, 5.7, ...
at the centres of the class-intervals on the base-line, heights
representing on this scale the class-frequencies 3, 12, 43, ...
The diagram may then be completed in one of two ways: (1)
as a frequency-polygon, by joining up the marks on the
verticals by straight lines, the last points at each end being joined
down to the base at the centre of the next class-interval (fig. 1);
or (2) as a column diagram or histogram (to use a term
suggested by Professor Pearson, ref. 1), short horizontals being drawn
through the marks on the verticals (fig. 2), which now form the
central axes of a series of rectangles representing the
class-frequencies. The student should note that in any such diagram,
of either form, a certain area represents a given number of
observations. On the scales suggested, 1 inch on the horizontal
represents 2 intervals, and 1 inch on the vertical represents 50
observations per interval: 1 square inch therefore represents
50 x 2 = 100 observations. The diagrams are, however,
conventional: the whole area of the figure is correct in either case,
but the area over each interval is not correct in the case of the
frequency-polygon, and the frequency of each fraction of any
interval is not the same, as suggested by the histogram. The
area shown by the frequency-polygon over any interval with an
ordinate $y_2$ (fig. 3) is only correct if the tops of the three
successive ordinates $y_1$, $y_2$, $y_3$ lie on a line, i.e. if
$y_2 = \frac{1}{2}(y_1 + y_3)$,
the areas of the two little triangles shaded in the figure being
equal. If $y_2$ fall short of this value, the area shown by the
polygon is too great; if $y_2$ exceed it, the area shown by the
polygon is too small; and if, for this reason, the frequency-
polygon tends to become very misleading at any part of the
range, it is better to use the histogram. In the mortality
distribution of Table I., for instance, the frequency rises so sharply
to the maximum that a histogram is, on the whole, the better
representation of the distribution of frequency, and in such a
distribution as that of Table IV. the use of the histogram is
almost imperative.

Fig. 1. Frequency-Polygon for Head-breadths of 1000 Cambridge
Students. (Table V.) Base-line: head-breadth in inches.

Fig. 2. Histogram for the same data as Fig. 1. Base-line:
head-breadth in inches.

Fig. 3.

Fig. 4.
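
The statement that the frequency-polygon gives the whole area correctly, but not the area over each interval, can be verified numerically. Taking the class-interval as the unit of width, the polygon passes through the heights (y1 + y2)/2, y2, and (y2 + y3)/2 at the beginning, centre, and end of the interval whose ordinate is y2, so that the two trapezia together have area (y1 + 6 y2 + y3)/8. This expression is derived here from the construction and is not quoted from the text; the function name `polygon_area` is likewise an assumption.

```python
from fractions import Fraction

# Class-frequencies of Table V. (head-breadths 5.5 to 6.8 inches).
freqs = [3, 12, 43, 80, 131, 236, 185, 142, 99, 37, 15, 12, 3, 2]


def polygon_area(y_prev, y, y_next):
    """Area under the frequency-polygon across one class-interval of unit
    width, the polygon joining the tops of ordinates at class centres."""
    return Fraction(y_prev + 6 * y + y_next, 8)


# Pad with empty classes: the polygon is joined down to the base at the
# centre of the next class-interval on each side.
padded = [0, 0] + freqs + [0, 0]
areas = [polygon_area(padded[i - 1], padded[i], padded[i + 1])
         for i in range(1, len(padded) - 1)]

total_area = sum(areas)
area_over_peak = areas[6]  # the class centred on 6.0 inches
```

The total area is 1000, the whole number of observations, but the area over the class with the maximum frequency is 216.5 against a true frequency of 236: the ordinate 236 exceeds the mean of its neighbours, 131 and 185, and the polygon accordingly shows too small an area there.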

12.  If  the  class-interval  be  made  smaller  and  smaller,  and  at 
the  same  time  the  number  of  observations  be  proportionately  in- 
creased, so  that  the  class-frequencies  may  remain  finite,  the 
polygon  and  the  histogram  will  approach  more  and  more  closely 
to  a  smooth  curve.  Such  an  ideal  limit  to  the  frequency-polygon 
or  histogram  is  termed  a  frequency-curve.  In  this  ideal  frequency- 
curve  the  area  between  any  two  ordinates  whatever  is  strictly 
proportional  to  the  number  of  observations  falling  between  the 
corresponding values of the variable. Thus the number of
observations falling between the values $x_1$ and $x_2$ of the variable
in fig. 4 will be proportional to the area of the shaded strip in the
figure; the number of observed values greater than $x_2$ will
similarly be given by the area of the curve to the right of the
ordinate through $x_2$, and so on. When, in any actual case, the
number  of  observations  is  considerable — say  a  thousand  at  least 
— the  run  of  the  class-frequencies  is  generally  sufficiently 
smooth  to  give  a  good  notion  of  the  form  of  the  ideal  distri- 
bution ;  with  small  numbers  the  frequencies  may  present  all 
kinds  of  irregularities,  which,  most  probably,  have  very  little 
significance (cf. Chap. XV. § 15, and § 18, Ex. iv.). The forms
presented  by  smoothly  running  sets  of  numerous  observations 
present  an  almost  endless  variety,  but  amongst  these  we  notice 
a  small  number  of  comparatively  simple  types,  from  which  many 
at  least  of  the  more  complex  distributions  may  be  conceived  as 
compounded.  For  elementary  purposes  it  is  sufficient  to  consider 
these  fundamental  simple  types  as  four  in  number,  the  symmetri- 
cal distribution,  the  moderately  asymmetrical  distribution,  the 
extremely  asymmetrical  or  J-shaped  distribution,  and  the  U-shaped 
distribution. 

13.  The  symmetrical  distribution,  the  class-frequencies  decreas- 
ing to  zero  symmetrically  on  either  side  of  a  central  maximum. 
Fig.  5  illustrates  the  ideal  form  of  the  distribution. 

Being  a  special  case  of  the  more  general  type  described  under 
the  second  heading,  this  form  of  distribution  is  comparatively  rare 
under  any  circumstances,  and  very  exceptional  indeed  in  economic 
statistics. It occurs more frequently in the case of biometric, more
especially  anthropometric,  measurements,  from  which  the  following 
illustrations  are  drawn,  and  is  important  in  much  theoretical  work. 
Table  VI.  shows  the  frequency-distribution  of  statures  for  adult 
males  in  the  British  Isles,  from  data  published  by  a  British 
Association Committee in 1883, the figures being given separately
for persons born in England, Scotland, Wales, and Ireland, and
totalled in the last column. These frequency-distributions are
approximately of the symmetrical type. The frequency-polygon
for the totals given by the last column of the table is shown
in fig. 6. The student will notice that an error of 1/16 inch,
scarcely appreciable in the diagram on its reduced scale, is neglected
in the scale shown on the base-line, the intervals being treated
as if they were 57-58, 58-59, etc. Diagrams should be drawn for
comparison showing, to a good open scale, the separate distributions
for England, Scotland, Wales, and Ireland.

Table VI. Showing the Frequency-distributions of Statures for Adult
Males born in England, Ireland, Scotland, and Wales. Final Report of
the Anthropometric Committee to the British Association. (Report, 1883,
p. 256.) As Measurements are stated to have been taken to the nearest
1/8th of an Inch, the Class-Intervals are here presumably 56 15/16-57 15/16,
and so on (cf. § 9). See Fig. 6.

Height,    Number of Men within said Limits of Height.
Inches.    Place of Birth:
           England.   Scotland.   Wales.   Ireland.    Total.

57-            1          -          1        -           2
58-            3          1          -        -           4
59-           12          -          1        1          14
60-           39          2          -        -          41
61-           70          2          9        2          83
62-          128          9         30        2         169
63-          320         19         48        7         394
64-          524         47         83       15         669
65-          740        109        108       33         990
66-          881        139        145       58        1223
67-          918        210        128       73        1329
68-          886        210         72       62        1230
69-          753        218         52       40        1063
70-          473        115         33       25         646
71-          254        102         21       15         392
72-          117         69          6       10         202
73-           48         26          2        3          79
74-           16         15          1        -          32
75-            9          6          1        -          16
76-            1          4          -        -           5
77-            1          1          -        -           2

Total       6194       1304        741      346        8585

Fig. 5. An ideal symmetrical Frequency-distribution.

Fig. 6. Frequency-distribution of Stature for 8585 Adult Males born in
the British Isles. (Table VI.) Base-line: stature in inches.

Table VII. gives two similar distributions from more recent
investigations,  relating  respectively  to  sons  over  18  years  of 
age,  with  parents  living,  in  Great  Britain,  and  to  students  at 
Cambridge.  The  polygons  are  shown  in  figs.  7  and  8.  Both  these 
distributions  are  more  irregular  than  that  of  fig.  6,  but,  roughly 
speaking,  they  may  all  be  held  to  be  approximately  symmetrical. 

14.  The  moderately  asymmetrical  distribution,  the  class-fre- 
quencies decreasing  with  markedly  greater  rapidity  on  one  side  of 
the  maximum  than  on  the  other,  as  in  fig.  9  [a)  or  (6).  This  is 
the  most  common  of  all  smooth  forms  of  frequency-distribution, 
illustrations  occurring  in  statistics  from  almost  every  source.  The 
distribution  of  death-rates  in  the  registration  districts  of  England 

Table  VII. — Shoxoing  the  Frequency -distribution  of  Statures  for  (1)  1078 
English  Sons  (Karl  Pearson,  Biomctrika,  ii.,  1903,  p.  415)  ;  (2)  for  1000 
Male  Students  at  Cambridge  (W.  R.  Macdonell,  Biometrika,  i.,  1902, 
p.  220).    See  Figs.  7  and  8. 


Number  of  Men  within  said 
Limits  of  Stature. 


Stature  in 
Inches. 


(2) 
Cambridge 
Students. 


(1) 

English  Sons. 


59 -  5-60 -5 

60-  5-61 -5 

61 -  5-62 -5 

62 -  5-63 -5 

63 -  5-64 -5 

64 -  5-65 -5 

65 -  5-66 -5 
66  ■5-67-5 

67 -  5-68 -5 

68-  5-69-5 

69  •5-70-5 

70  •.5-71-5 
7r5-72-5 

72-  5-73-5 

73 -  5-74 -5 
74  •5-75-5 

75-  5-76-5 

76-  5-77-5 

77-  5-78-5 

78-  5-79-5 


20 
1-5 

3-  5 
20-5 
38-5 
61-5 
89-5 

148-  0 
173-5 

149-  5 
128-0 
108^0 

63  0 
42-0 
29^0 
8-5 
4  0 

4-  0 
3-0 
0-5 


4-0 

190 
24-5 
40-5 
84-5 
123-5 
139-0 
179-0 
138-5 
108-0 
53-5 
47-5 
21-0 
12-0 
50 
0-6 


Total 


1078 


1000 
Fig. 7.—Frequency-distribution of Stature for 1078 "English Sons."
(Table VII.)

Fig. 8.—Frequency-distribution of Stature for 1000 Cambridge
Students. (Table VII.)
and Wales, given in Table I., p. 77, is a somewhat rough example
of the type. The distribution of rates of pauperism in the same


Fig. 9.—Ideal distributions of the moderately asymmetrical form.

districts (Table VIII. and fig. 10) is smoother and more like the
type (a) of fig. 9. The frequency attains a maximum for


Fig. 10.—Frequency-distribution of Pauperism (Percentage of the Population
in Receipt of Poor-law Relief) on 1st January 1891 in the Registration
Districts of England and Wales; 632 Districts. (Table VIII.) Horizontal
axis: percentage of the population in receipt of relief.
districts with 2¾ to 3¼ per cent. of the population in receipt of
relief, and then tails off slowly to unions with 6, 7, and 8 per
cent. of pauperism.

Table VIII.—Showing the Number of Registration Districts in England and
Wales with Different Percentages of the Population in receipt of Poor-law
Relief on the 1st January 1891. (Yule, Jour. Roy. Stat. Soc., vol. lix.,
1896, p. 347, q.v. for distributions for earlier years.) See Fig. 10.

Percentage of the          Number of Unions with
Population in receipt      given Percentage in
of Relief.                 receipt of Relief.

0.75-1.25                        18
1.25-1.75                        48
1.75-2.25                        72
2.25-2.75                        89
2.75-3.25                       100
3.25-3.75                        90
3.75-4.25                        75
4.25-4.75                        60
4.75-5.25                        40
5.25-5.75                        21
5.75-6.25                        11
6.25-6.75                         5
6.75-7.25                         1
7.25-7.75                         1
7.75-8.25                         0
8.25-8.75                         1

Total                           632

While  the  distribution  of  stature  is  in  general  symmetrical,  that 
of  weight  is  asymmetrical  or  skew,  the  greater  frequencies  lying 
towards  the  lower  end  of  the  range.  This  is  shown  very  well  by 
the  data  (Table  IX.  and  fig.  11)  collected  by  the  same  British 
Association  Committee,  from  the  Report  of  which  the  data  as  to 
stature were cited in the last section. As in the case of the stature
diagram (fig. 6), the small error of ½ lb. has been neglected, for
the sake of brevity, in lettering the base-line of fig. 11, the classes
being treated as if they were 90 lb.-100 lb., 100 lb.-110 lb.,
and so on.

Table X. and fig. 12 give a biological illustration, viz. the
distribution of fecundity (ratio of yearling foals produced to
coverings) in mares. The student should notice the difficulty




Fig. 11.—Frequency-distribution of Weight for 7749 Adult Males in
the British Isles. (Table IX.) Horizontal axis: weight in lbs.

Fig. 12.—Frequency-distribution of Fecundity for Brood-mares:
2000 observations. (Table X.) Horizontal axis: ratio of yearling foals
produced to coverings.
Table IX.—Showing the Frequency-distribution of Weights for Adult Males
born in England, Ireland, Scotland, and Wales. (Loc. cit., Table VI.)
Weights were taken to the nearest pound, consequently the true Class-
Intervals are 89.5-99.5, 99.5-109.5, etc. (§ 9).




Weight in lbs.

Number of Men within given Limits of Weight. Place of Birth:

England.

Scotland.

Wales.

Ireland.

Total.

90- 

2 

— 

2 

100- 

26 

1 

1 

5 

34 

110- 

133 

8 

10 

1 

152 

120- 

338 

22 

23

7 

390 

130- 

694 

63 

68 

42 

867 

140- 

1240 

173 

153 

57 

1623 

150- 

1075 

255 

178 

51 

1559 

160- 

881 

275 

134 

36 

1326 

170- 

492 

168 

102 

25 

787 

180- 

304 

125 

34 

13 

476 

190- 

174 
111 

67 

14 

3 

263 

200- 

75 

24 

7 

1 

107 

210- 

62 

14 

8 

1 

85 

220- 

33 

7 

1 

41 

230- 

10 

4 

2 

16 

240- 

9 

2 

11 

250- 

3 

4 

1 

8 

260- 

1 

1 

270- 
280- 

1 

1 

Total 

5552 

1212 

738 

247 

7749 

of classification in this case: the class-interval chosen throughout
the middle of the range is 1/15th, but the last interval is
"29/30-1." This is not a whole interval, but it is more than a
half, for all the cases of complete fecundity are reckoned into the
class.  In  the  diagram  (fig.  12)  it  has  been  reckoned  as  a  whole 
class,  and  this  gives  a  smooth  distribution. 

To  take  an  illustration  from  meteorology,  the  distribution  of 
barometer  heights  at  any  one  station  over  a  period  of  time  is,  in 
general,  asymmetrical,  the  most  frequent  heights  lying  towards  the 
upper  end  of  the  range  for  stations  in  England  and  Wales. 
Table  XI.  and  fig.  13  show  the  distribution  for  daily  observations 
at  Southampton  during  the  years  1878-90  inclusive. 

The distributions of Tables VIII.-XI. all follow more or less the
type of fig. 9 (a), the frequency tailing off, at the steeper end of
Table X.—Showing the Frequency-distribution of Fecundity, i.e. the Ratio
of the Number of Yearling Foals produced to the Number of Coverings,
for Brood-mares (Race-horses) Covered Eight Times at Least. (Pearson,
Lee, and Moore, Phil. Trans., A, vol. cxcii. (1899), p. 303.) See Fig. 12.

               Number of Mares                    Number of Mares
               with Fecundity                     with Fecundity
Fecundity.     between the       Fecundity.       between the
               Given Limits.                      Given Limits.

 1/30- 3/30          2           17/30-19/30          315
 3/30- 5/30          7.5         19/30-21/30          337
 5/30- 7/30         11.5         21/30-23/30          293.5
 7/30- 9/30         21.5         23/30-25/30          204
 9/30-11/30         55           25/30-27/30          127
11/30-13/30        104.5         27/30-29/30           49
13/30-15/30        182           29/30-1               19
15/30-17/30        271.5
                                 Total               2000.0

Table XI.—Showing the Frequency-distribution of Barometer Heights for
Daily Observations during the Thirteen Years 1878-1890 at Southampton.
(Karl Pearson and A. Lee, Phil. Trans., A, vol. cxc. (1897), p. 428, q.v.
for numerous other distributions.) See Fig. 13.

                Number of Days on                   Number of Days on
Height of       which Height was    Height of       which Height was
Barometer       observed between    Barometer       observed between
in Inches.      the Given Limits.   in Inches.      the Given Limits.

28.45-28.55            1            29.85-29.95          548.5
28.55-28.65            2            29.95-30.05          602.5
28.65-28.75            2            30.05-30.15          619.5
28.75-28.85            4            30.15-30.25          500
28.85-28.95            8.5          30.25-30.35          382
28.95-29.05           13.5          30.35-30.45          237.5
29.05-29.15           21.5          30.45-30.55          189.5
29.15-29.25           37            30.55-30.65           88.5
29.25-29.35           79            30.65-30.75           43.5
29.35-29.45          108            30.75-30.85            7
29.45-29.55          181.5          30.85-30.95            4
29.55-29.65          254.5          30.95-31.05            1
29.65-29.75          348.5
29.75-29.85          463.5          Total               4748
Fig. 13.—Frequency-distribution of Barometer Heights at
Southampton: 4748 observations. (Table XI.) Horizontal axis: height
in inches.

Fig. 14.—Frequency-distribution of Deaths from Diphtheria at different Ages
in England and Wales, 1891-1900. (Table XII.)
the distribution, in such a way as to suggest that the ideal
curve is tangential to the base. Cases of greater asymmetry,
suggesting an ideal curve that meets the base (at one end) at a
finite angle, even a right angle, as in fig. 9 (b), are less frequent,
but occur occasionally. The distribution of deaths from diphtheria,
according to age, affords one such example of a more asymmetrical
kind. The actual figures for this case are given in Table XII., and
illustrated by fig. 14; and it will be seen that the frequency of
deaths  reaches  a  maximum  for  children  aged  "  3  and  under  4," 
the  number  rising  very  rapidly  to  the  maximum,  and  thence 
falling  so  slowly  that  there  is  still  an  appreciable  frequency  for 
persons  over  60  or  70  years  of  age. 

Table XII.—Showing the Numbers of Deaths from Diphtheria at Different
Ages in England and Wales during the Ten Years 1891-1900. (Supplement
to 65th Annual Report of the Registrar-General, 1891-1900, p. 3.)
See Fig. 14.

                    Number of Deaths
Age in Years.       between Given         Number
                    Limits of Age.        per Annum.

Under 1 year             4,186              4,186
1-                      10,491             10,491
2-                      11,218             11,218
3-                      12,390             12,390
4-                      11,194             11,194
5-                      23,348              4,670
10-                      4,092                818
15-                      1,123                225
20-                        585                117
25-                        786                 79
35-                        512                 51
45-                        324                 32
55-                        260                 26
65-                        127                 13
75 and upwards              35

Total                   80,671

15. The extremely asymmetrical, or "J-shaped," distribution, the
class-frequencies running up to a maximum at one end of the
range, as in fig. 15.

This may be regarded as the extreme form of the last distribution,
from which it cannot always be distinguished by elementary
methods if the original data are not available. If, for instance,
the frequencies of Table XII. had been given by five-year intervals
only, they would have run 49,479, 23,348, 4,092, and so on,
thus  suggesting  a  maximum  number  of  deaths  at  the  beginning 
of  life,  i.e.  a  distribution  of  the  present  type.  It  is  only  the 
analysis  of  the  deaths  in  the  earlier  years  of  life  by  one-year 
intervals  which  shows  that  the  frequency  reaches  a  true  maximum 
in  the  fourth  year,  and  therefore  the  distribution  is  of  the 
moderately  asymmetrical  type.    In  practical  cases  no  hard  and 


Fig.  15. — An  ideal  Distribution  of  the  extreme  Asymmetrical  Form. 

fast  line  can  always  be  drawn  between  the  moderately  and 
extremely  asymmetrical  types,  any  more  than  between  the 
moderately  asymmetrical  and  the  symmetrical  type. 
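
The effect of the coarser grouping can be checked from the figures of Table XII. The following is only an illustrative sketch in Python, not part of the original text: the function names are hypothetical, and the open class "75 and upwards" is left out because it has no upper limit. It merges the one-year classes under five into a single class, and divides each class-frequency by the width of its interval, which is how the last column of the table is obtained.

```python
# Deaths from diphtheria, England and Wales, 1891-1900 (Table XII).
# Each entry: (lower age limit, upper age limit, number of deaths).
# The open class "75 and upwards" (35 deaths) is omitted: it has no upper limit.
TABLE_XII = [
    (0, 1, 4186), (1, 2, 10491), (2, 3, 11218), (3, 4, 12390), (4, 5, 11194),
    (5, 10, 23348), (10, 15, 4092), (15, 20, 1123), (20, 25, 585),
    (25, 35, 786), (35, 45, 512), (45, 55, 324), (55, 65, 260), (65, 75, 127),
]


def per_year_of_age(table):
    """Frequency per one-year interval of age for each class."""
    return [(lo, hi, deaths / (hi - lo)) for lo, hi, deaths in table]


def regroup_under_five(table):
    """Merge all classes below age 5 into a single five-year class."""
    merged = sum(deaths for lo, hi, deaths in table if hi <= 5)
    return [(0, 5, merged)] + [row for row in table if row[0] >= 5]


def modal_class(table):
    """Limits of the class with the greatest frequency per year of age."""
    return max(per_year_of_age(table), key=lambda row: row[2])[:2]


coarse = regroup_under_five(TABLE_XII)
print(coarse[:3])              # the first three five-year classes
print(modal_class(TABLE_XII))  # class of greatest frequency, one-year detail
print(modal_class(coarse))     # class of greatest frequency, five-year classes
```

With the one-year detail the greatest frequency falls in the class "3 and under 4"; after regrouping it falls in the first class, and the distribution looks J-shaped.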

In  economic  statistics  this  form  of  distribution  is  particularly 
characteristic  of  the  distribution  of  wealth  in  the  population  at 
large, as illustrated, e.g., by income tax and house valuation returns,
by returns of the size of agricultural holdings, and so on (cf. ref. 4).
The distributions may possibly be a very extreme case of the last
type; but if the maximum is not absolutely at the lower end of the
range, it is very close indeed thereto. Official returns do not
usually give the necessary analysis of the frequencies at the
lower end of the range to enable the exact position of the maximum
to be determined; and for this reason the data on which Table
XIII. is founded, though of course very unreliable, are of some
interest.  It  will  be  seen  from  the  table  and  fig.  16  that  with  the 
given  classification  the  distribution  appears  clearly  assignable  to 
the  present  type,  the  number  of  estates  between  zero  and  £100 
in  annual  value  being  more  than  six  times  as  great  as  the  number 
between  £100  and  £200  in  annual  value,  and  the  frequency 
continuously  falling  as  the  value  increases.  A  close  analysis  of 
the  first  class  suggests,  however,  that  the  greatest  frequency  does 
not  occur  actually  at  zero,  but  that  there  is  a  true  maximum 
frequency  for  estates  of  about  £1  15  0  in  annual  value.  The 
distribution  might  therefore  be  more  correctly  assigned  to  the 
second  type,  but  the  position  of  the  greatest  frequency  indicates  a 


Table XIII.—Showing the Numbers and Annual Values of the Estates of
those who had taken part in the Jacobite Rising of 1715. (Compiled from
Cosin's Names of the Roman Catholics, Nonjurors, and others who refused
to take the Oaths to his late Majesty King George, etc.; London, 1745.
Figures of very doubtful absolute value. See a note in Southey's
Commonplace Book, vol. i. p. 573, quoted from the Memoirs of T. Hollis.)
See Fig. 16.

Annual Value     Number of       Annual Value     Number of
in £100.         Estates.        in £100.         Estates.

 0- 1             1726.5         17-18                1
 1- 2              280           20-21                4
 2- 3              140.5         21-22                1
 3- 4               87           22-23                1
 4- 5               46.5         23-24                1
 5- 6               42.5         27-28                2
 6- 7               29.5         31-32                1
 7- 8               25.5         39-40                1
 8- 9               18.5         45-46                1
 9-10               21           48-49                1
10-11               11.5
11-12                9.5
12-13                4
13-14                3.5
14-15                8
15-16                3
16-17                5           Total             2476
degree of asymmetry that is high even compared with the
asymmetry of fig. 14: the distribution of numbers of deaths from
diphtheria would more closely resemble the distribution of estate-
values if the maximum occurred in the fourth and fifth weeks
of life instead of in the fourth year. The figures of Table IV.,
p. 83, showing the annual value and number of dwelling-houses,
afford a good illustration of this form of distribution, but marred
by the unequal intervals so common in official returns.

Fig. 16.—Frequency-distribution of the Annual Values of certain Estates
in England in 1715: 2476 Estates. (Table XIII.)


Table XIV.—Showing the Frequencies of Different Numbers of Petals for
Three Series of Ranunculus bulbosus. (H. de Vries, Ber. dtsch. bot. Ges.,
Bd. xii., 1894, q.v. for details.) See Fig. 17.


Number of Petals.

Frequency.

Series A.

Series B.

Series C.

5 

312 

345 

133 

6 

17 

24 

55 

7 

4 

7 

23 

8 

2 

7 

9 

2 

2 

2 

10 

2 

11 

2 

Total 

337 

380 

222 

The type is not very frequent in other classes of material, but
instances occur here and there. Table XIV. and fig. 17 show
distributions of this form for the petals of the buttercup,
Ranunculus bulbosus.

Fig. 17.—Frequency-distributions of Numbers of Petals for Three Series of
Ranunculus bulbosus: A 337, B 380, C 222 observations. (Table XIV.)

16. The U-shaped distribution, exhibiting a maximum frequency
at the ends of the range and a minimum towards the centre.
The ideal form of the distribution is illustrated by fig. 18.


Fig.  18. — An  ideal  Distribution  of  the  U-shaped  Form. 

This  is  a  rare  but  interesting  form  of  distribution,  as  it  stands 
in  somewhat  marked  contrast  to  the  preceding  forms.  Table  XV. 
and  fig.  19  illustrate  an  example  based  on  a  considerable  number 
of  observations,  viz.  the  distribution  of  degrees  of  cloudiness,  or 
estimated  percentage  of  the  sky  covered  by  cloud,  at  Breslau 


Table XV.—Showing the Frequencies of Estimated Intensities of Cloudiness
at Breslau during the Ten Years 1876-85. (See ref. 2.) See Fig. 19.

Cloudiness.     Frequency.      Cloudiness.     Frequency.

0                  751          6                   21
1                  179          7                   71
2                  107          8                  194
3                   69          9                  117
4                   46          10                2089
5                    9
                                Total             3653
during the years 1876-85. A sky completely, or almost completely,
overcast at the time of observation is the most common,
a practically clear sky comes next, and intermediates are more
rare.

Fig. 19.—Frequency-distribution of Degrees of Cloudiness at Breslau,
1876-85: 3653 observations. (Table XV.)

This form of distribution appears to be sometimes exhibited by
the percentages of offspring possessing a certain attribute when one
at least of the parents also possesses the attribute. The remarks
of Sir Francis Galton in Natural Inheritance suggest such a
form for the distribution of "consumptivity" amongst the offspring
of consumptives, but the figures are not in a decisive shape.
Table XVI. gives the distribution for an analogous case, viz. the


Table XVI.—Showing the Percentages of Deaf-mutes among Children of
Parents one of whom at least was a Deaf-mute, for Marriages producing
Five Children or more. (Compiled from material in Marriages of the Deaf
in America, ed. E. A. Fay, Volta Bureau, Washington, 1898.)

Percentage of     Number of      Percentage of     Number of
Deaf-mutes.       Families.      Deaf-mutes.       Families.

 0-20               220          60-80                5.5
20-40                20.5        80-100              15
40-60                12
                                 Total              273
distribution of deaf-mutism amongst the offspring of parents one
of whom at least was a deaf-mute. In general less than one-fifth
of the children are deaf-mutes: at the other end of the range the
cases in which over 80 per cent. of the children are deaf-mutes are
nearly  three  times  as  many  as  those  in  which  the  percentage  lies 
between  60  and  80.  The  numbers  are,  however,  too  small  to  form 
a  very  satisfactory  illustration. 
The first three memoirs above are mathematical memoirs on the theory
of  ideal  frequency-curves,  the  first  being  the  fundamental  memoir,  and 
the  second  and  third  supplementary.  The  elementary  student  may, 
however,  refer  to  them  with  advantage,  on  account  of  the  large  collection 
of  frequency-distributions  which  is  given,  and  from  which  some  of  the 
illustrations  in  the  preceding  chapter  have  been  cited.  Without 
attempting  to  follow  the  mathematics,  he  may  also  note  that  each  of 
our  rough  empirical  types  may  be  divided  into  several  sub-types,  the 
theoretical  division  into  types  being  made  on  different  grounds. 

The fourth work is cited on account of the author's discussion of the
distribution of wealth in a community, to which reference was made in § 15.

In  connection  with  the  remarks  in  §  6,  on  the  grouping  of  ages, 
reference  may  be  made  to  the  following  in  which  a  different  conclusion 
is  drawn  as  to  the  best  grouping : — 

(5) Young, Allyn A., "A Discussion of Age Statistics," Census Bulletin 13,
Bureau of the Census, Washington, U.S.A., 1904.

Reference should also be made to the Census of England and Wales,
1911, vol. vii., "Ages and Condition as to Marriage," especially the
Report  by  Mr  George  King  on  the  graduation  of  ages. 

EXERCISES. 

1. If the diagram fig. 6 is redrawn to scales of 300 observations per interval
to the inch and 4 inches of stature to the inch, what is the scale of
observations to the square inch?

If the scales are 100 observations per interval to the centimetre and 2 inches
of stature to the centimetre, what is the scale of observations to the
square centimetre?

2. If fig. 10 is redrawn to scales of 25 observations per interval to the inch and
2 per cent. to the inch, what is the scale of observations to the square inch?

If the scales are ten observations per interval to the centimetre and 1 per cent.
to the centimetre, what is the scale of observations to the square centimetre?

3. If a frequency-polygon be drawn to represent the data of Table I., what
number of observations will the polygon show between death-rates of 16.5
and 17.5 per thousand, instead of the true number 159?

4. If a frequency-polygon be drawn to represent the data of Table V.,
what number of observations will the polygon show between head-breadths
5.95 and 6.05, instead of the true number 236?


CHAPTER  VII. 


AVERAGES. 

1. Necessity for quantitative definition of the characters of a frequency-
distribution—2. Measures of position (averages) and of dispersion—3.
The dimensions of an average the same as those of the variable—4.
Desirable properties for an average to possess—5. The commoner forms
of average—6-13. The arithmetic mean: its definition, calculation, and
simpler properties—14-18. The median: its definition, calculation, and
simpler properties—19-20. The mode: its definition and relation to
mean and median—21. Summary comparison of the preceding forms
of average—22-26. The geometric mean: its definition, simpler
properties, and the cases in which it is specially applicable—27. The
harmonic mean: its definition and calculation.

1. In § 2 of the last chapter it was pointed out that a classification
of  the  observations  in  any  long  series  is  the  first  step  necessary 
to  make  the  observations  comprehensible,  and  to  render  possible 
those  comparisons  with  other  series  which  are  essential  for  any 
discussion  of  causation.  Very  little  experience,  however,  would 
show  that  classification  alone  is  not  an  adequate  method,  seeing 
that  it  only  enables  qualitative  or  verbal  comparisons  to  be  made. 
The  next  step  that  it  is  desirable  to  take  is  the  quantitative 
definition  of  the  characters  of  the  frequency-distribution,  so  that 
quantitative  comparisons  may  be  made  between  the  corresponding 
characters  of  two  or  more  series.  It  might  seem  at  first  sight 
that very difficult cases of comparison could arise in which, for
example, we had to contrast a symmetrical distribution with a
"J-shaped" distribution. As a matter of practice, however, we seldom
have  to  deal  with  such  a  case ;  distributions  drawn  from  similar 
material  are,  in  general,  of  similar  form.  When  we  have  to 
compare  the  frequency-distributions  of  stature  in  two  races  of 
man,  of  the  death-rates  in  English  registration  districts  in  two 
successive  decades,  of  the  numbers  of  petals  in  two  races  of  the 
same  species  of  Ranunculus,  we  have  only  to  compare  with  each 
other  two  distributions  of  the  same  or  nearly  the  same  type. 

2.  Confining  our  attention,  then,  to  this  simple  case,  there  are 
two  fundamental  characteristics  in  which  such  distributions  may 
differ: (1) they may differ markedly in position, i.e. in the values
of the variable round which they centre, as in fig. 20, A, or (2)
they may centre round the same value, but differ in the range of
variation or dispersion, as it is termed, as in fig. 20, B. Of course
the distributions may differ in both characters at once, as in fig. 20,
C, but the two properties may be considered independently.
Measures of the first character, position, are generally known as
averages; measures of the second are termed measures of dispersion.
In addition to these two principal and fundamental
characters, we may also take a third of some interest but of much
less importance, viz. the degree of asymmetry of the distribution.


Fig. 20.

The  present  chapter  deals  only  with  averages;  measures  of 
dispersion  are  considered  in  Chapter  VIII.  and  measures  of 
asymmetry  are  also  briefly  discussed  at  the  end  of  that  chapter. 

3.  In  whatever  way  an  average  is  defined,  it  may  be  as  well  to 
note,  it  is  merely  a  certain  value  of  the  variable,  and  is  therefore 
necessarily  of  the  same  dimensions  as  the  variable :  i.e.  if  the 
variable  be  a  length,  its  average  is  a  length ;  if  the  variable  be  a 
percentage,  its  average  is  a  percentage,  and  so  on.  But  there  are 
several  different  ways  of  approximately  defining  the  position  of  a 
frequency-distribution,  that  is,  there  are  several  different  forms  of 
average, and the question therefore arises, By what criteria are we
to judge the relative merits of different forms? What are, in fact,
the desirable properties for an average to possess?
4. (a) In the first place, it almost goes without saying that an
average should be rigidly defined, and not left to the mere estimation
of the observer. An average that was merely estimated would
depend too largely on the observer as well as the data. (b) An
average should be based on all the observations made. If not,
it is not really a characteristic of the whole distribution. (c) It
is desirable that the average should possess some simple and
obvious properties to render its general nature readily
comprehensible: an average should not be of too abstract a mathematical
character. (d) It is, of course, desirable that an average should
be calculated with reasonable ease and rapidity. Other things
being equal, the easier calculated is the better of two forms of
average. At the same time too great weight must not be attached
to mere ease of calculation, to the neglect of other factors. (e)
It  is  desirable  that  the  average  should  be  as  little  affected  as 
may  be  possible  by  what  we  have  termed  fluctuations  of  sampling. 
If  different  samples  be  drawn  from  the  same  material,  however 
carefully  they  may  be  taken,  the  averages  of  the  different  samples 
will  rarely  be  quite  the  same,  but  one  form  of  average  may  show 
much  greater  differences  than  another.  Of  the  two  forms,  the 
more  stable  is  the  better.  The  full  discussion  of  this  condition 
must,  however,  be  postponed  to  a  later  section  of  this  work 
(Chap.  XVII.).  (f)  Finally,  by  far  the  most  important  desideratum 
is  this,  that  the  measure  chosen  shall  lend  itself  readily  to 
algebraical  treatment.  If,  e.g.,  two  or  more  series  of  observations 
on  similar  material  are  given,  the  average  of  the  combined  series 
should  be  readily  expressed  in  terms  of  the  averages  of  the 
component  series :  if  a  variable  may  be  expressed  as  the  sum  of 
two  or  more  others,  the  average  of  the  whole  should  be  readily 
expressed  in  terms  of  the  averages  of  its  parts.  A  measure  for 
which  simple  relations  of  this  kind  cannot  be  readily  determined 
is  likely  to  prove  of  somewhat  limited  application. 
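
Condition (f) may be illustrated with the arithmetic mean, defined in § 6 below. The following is a minimal sketch in Python, not part of the original text; the function names and the small series of figures are hypothetical and chosen only for illustration. It checks that the mean of two pooled series can be found from the sizes and means of the component series alone, and that the mean of a sum of two variables is the sum of their means.

```python
from fractions import Fraction


def mean(values):
    """Arithmetic mean: the sum of the values divided by their number."""
    return Fraction(sum(values), len(values))


def combined_mean(n1, m1, n2, m2):
    """Mean of two pooled series, from their sizes and means alone."""
    return (n1 * m1 + n2 * m2) / (n1 + n2)


# Two hypothetical series of observations (illustrative figures only).
series_a = [4, 7, 7, 10]
series_b = [1, 3, 5, 6, 10]

pooled = combined_mean(len(series_a), mean(series_a),
                       len(series_b), mean(series_b))
# The pooled mean agrees with the mean of all nine observations taken together.
assert pooled == mean(series_a + series_b)

# A variable expressed as the sum of two others, observed on the same
# individuals: the mean of the whole is the sum of the means of the parts.
x = [2, 5, 9]
y = [1, 1, 4]
totals = [a + b for a, b in zip(x, y)]
assert mean(totals) == mean(x) + mean(y)
```

Exact fractions are used so that the comparisons are free from rounding error.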

5.  There  are  three  forms  of  average  in  common  use,  the 
arithmetic  mean,  the  median,  and  the  mode,  the  first  named  being 
by  far  the  most  widely  used  in  general  statistical  work.  To 
these  may  be  added  the  geometric  mean  and  the  harmonic  mean, 
more rarely used, but of service in special cases. We will
consider these in the order named.

6. The arithmetic mean.—The arithmetic mean of a series of
values of a variable $X_1, X_2, X_3, \dots, X_N$, $N$ in number, is the
quotient of the sum of the values by their number. That is to
say, if $M$ be the arithmetic mean,

$$M = \frac{1}{N}\left(X_1 + X_2 + X_3 + \dots + X_N\right),$$

or, to express it more briefly by using the symbol $\Sigma$ to denote
"the sum of all quantities like,"

$$M = \frac{1}{N}\Sigma(X) \tag{1}$$

The  word  mean  or  average  alone,  without  qualification,  is  very 
generally  used  to  denote  this  particular  form  of  average  :  that 
is  to  say,  when  anyone  speaks  of  "the  mean  "  or  "the  average" 
of  a  series  of  observations,  it  may,  as  a  rule,  be  assumed  that  the 
arithmetic  mean  is  meant.  It  is  evident  that  the  arithmetic 
mean fulfils the conditions laid down in (a) and (b) of § 4, for it
is rigidly defined and based on all the observations made.
Further, it fulfils condition (c), for its general nature is readily
comprehensible. If the wages-bill for N workmen is £P, the
arithmetic mean wage, P/N pounds, is the amount that each
would receive if the whole sum available were divided equally
between them: conversely, if we are told that the mean wage
is £M, we know this means that the wages-bill is N.M pounds.
Similarly, if N families possess a total of C children, the mean
number of children per family is C/N, the number that each
family would possess if the children were shared uniformly.
Conversely, if the mean number of children per family is M, the
total number of children in N families is N.M. The arithmetic
mean  expresses,  in  fact,  a  simple  relation  between  the  whole 
and  its  parts. 

7.  As  regards  simplicity  of  calculation,  the  mean  takes  a  high 
position.  In  the  cases  just  cited,  it  will  be  noted  that  the  mean 
is  actually  determined  without  even  the  necessity  of  determining 
or  noting  all  the  individual  values  of  the  variable  :  to  get  the 
mean  wage  we  need  not  know  the  wages  of  every  hand,  but  only 
the  wages-bill ;  to  get  the  mean  number  of  children  per  family 
we  need  not  know  the  number  in  each  family,  but  only  the  total. 
If  this  total  is  not  given,  but  we  have  to  deal  with  a  moderate 
number  of  observations — so  few  (say  30  or  40)  that  it  is  hardly 
worth  while  compiling  the  frequency-distribution — the  arithmetic 
mean  is  calculated  directly  as  suggested  by  the  definition,  i.e. 
all  the  values  observed  are  added  together  and  the  total  divided 
by  the  number  of  observations.  But  if  the  number  of  observations 
be  large,  this  direct  process  becomes  a  little  lengthy.  It  may 
be  shortened  considerably  by  forming  the  frequency-table  and 
treating  all  the  values  in  each  class  as  if  they  were  identical  with 
the  mid-value  of  the  class-interval,  a  process  which  in  general 
gives  an  approximation  that  is  quite  suflficiently  exact  for  prac- 
tical purposes  if  the  class-interval  has  been  taken  moderately 
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small  (cf.  Chap.  VI.  §  5).  In  this  process  each  class-frequency 
is  multiplied  by  the  mid-value  of  the  interval,  the  products  added 
together,  and  the  total  divided  by  the  number  of  observations. 
If $f$ denote the frequency of any class, $X$ the mid-value of the
corresponding class-interval, the value of the mean so obtained
may be written:

$$M = \frac{1}{N}\,\Sigma(f.X) \qquad (2)$$

8.  But  this  procedure  is  still  further  abbreviated  in  practice 
by  the  following  artifices : — (1)  The  class-interval  is  treated 
as  the  unit  of  measurement  throughout  the  arithmetic ;  (2)  the 
difference between the mean and the mid-value of some arbitrarily
chosen class-interval is computed instead of the absolute
value  of  the  mean. 

If $A$ be the arbitrarily chosen value and

$$X = A + \xi \qquad (3)$$

then

$$\Sigma(f.X) = \Sigma(f.A) + \Sigma(f.\xi)$$

or, since $A$ is a constant,

$$M = A + \frac{1}{N}\,\Sigma(f.\xi) \qquad (4)$$

The calculation of $\Sigma(f.X)$ is therefore replaced by the calculation
of $\Sigma(f.\xi)$. The advantage of this is that the class-frequencies
need only be multiplied by small integral numbers; for $A$
being the mid-value of a class-interval, and $X$ the mid-value of
another, and the class-interval being treated as a unit, the $\xi$'s
must be a series of integers proceeding from zero at the arbitrary
origin $A$. To keep the values of $\xi$ as small as possible, $A$ should
be chosen near the middle of the range.
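
The two routes to the grouped mean, equation (2) and the abbreviated form (4), may be sketched in Python as follows. This is only an illustrative sketch: the function names and the small frequency-table are invented for the purpose, and nothing beyond the two formulas is taken from the text.

```python
def grouped_mean(mid_values, freqs):
    """Equation (2): M = sum(f * X) / N, with X the class mid-values."""
    n = sum(freqs)
    return sum(f * x for f, x in zip(freqs, mid_values)) / n


def short_method_mean(mid_values, freqs, origin, interval):
    """Equation (4): M = A + sum(f * xi) / N, xi being counted in class-intervals."""
    n = sum(freqs)
    # xi is the whole number of class-intervals separating X from the origin A.
    xis = [round((x - origin) / interval) for x in mid_values]
    total = sum(f * xi for f, xi in zip(freqs, xis))
    # The quotient is in class-intervals; multiply by the interval to return to units.
    return origin + interval * total / n


# Hypothetical table: mid-values 10, 15, ..., 30 with made-up frequencies.
mids = [10, 15, 20, 25, 30]
freqs = [3, 8, 12, 5, 2]

direct = grouped_mean(mids, freqs)
short = short_method_mean(mids, freqs, origin=20, interval=5)
print(direct, short)
```

Both calls should return the same value; the second merely multiplies the frequencies by the small integers -2, -1, 0, 1, 2 instead of by the mid-values themselves.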

It may be mentioned here that $\frac{1}{N}\Sigma(\xi)$, or $\frac{1}{N}\Sigma(f.\xi)$ for the grouped
distribution, is sometimes termed the first moment of the distribution
about the arbitrary origin $A$: we shall not, however, make
use of this term.

9. The process is illustrated by the following example, using
the frequency-distribution of Table VIII., Chap. VI. The
arbitrary origin $A$ is taken at 3.5 per cent., the middle of the
sixth class-interval from the top of the table, and a little nearer
than the middle of the range to the estimated position of the
mean. The consequent values of $\xi$ are then written down as in
column (3) of the table, against the corresponding frequencies, the
values starting, of course, from zero opposite 3.5 per cent. Each
frequency $f$ is then multiplied by its $\xi$ and the products entered
in another column (4). The negative and positive products are
totalled separately, giving totals -776 and +509 respectively,
whence $\Sigma(f.\xi) = -267$. Dividing this by $N$, viz. 632, we have
the difference of $M$ from $A$ in class-intervals, viz. -0.42 intervals,
that is -0.21 per cent. Hence the mean is 3.5 - 0.21 = 3.29
per cent.

Calculation of the Mean: Example i. Calculation of the Arithmetic
Mean of the Percentages of the Population in receipt of Relief, from the
Figures of Table VIII., Chap. VI., p. 93.

    (1)              (2)          (3)                 (4)
    Percentage in    Frequency    Deviation from      Product
    receipt of       $f$          Arbitrary           $f.\xi$
    relief.                       Value $A$, $\xi$

    ...              ...          ...                 ...
    3                100           -1                 100
    3.5               90            0                   0
    4                 75           +1                  75
    4.5               60           +2                 120
    5                 40           +3                 120
    5.5               21           +4                  84
    6                 11           +5                  55
    6.5                5           +6                  30
    7                  1           +7                   7
    7.5                1           +8                   8
    8                  -           +9                   -
    8.5                1          +10                  10

    Total            632
    Total of negative products                       -776
    Total of positive products                       +509

$\Sigma(f.\xi) = +509 - 776 = -267$
$M - A = -\frac{267}{632}$ class-intervals $= -0.42$ class-intervals
$= -0.21$ units
$\therefore$ mean $M = 3.5 - 0.21 = 3.29$ per cent.

It must always be remembered that $\Sigma(f.\xi)/N$ gives the value of
$M - A$ in class-intervals, and must not be added directly to $A$
unless the interval is also a unit. In the present illustration the
interval is half a unit, and accordingly the quotient 267/632 is
halved in order to obtain an answer in units. Care must also be
taken to give the right sign to the quotient.
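
The caution as to units may be made concrete by a short Python sketch. It uses only the totals quoted in the text for the pauperism distribution; the function name is an invention for illustration.

```python
def mean_from_totals(origin, sum_f_xi, n, interval):
    """M = A + interval * sum(f * xi) / N; the quotient alone is in class-intervals."""
    deviation_in_intervals = sum_f_xi / n
    return origin + interval * deviation_in_intervals


# Totals quoted in the text: +509 and -776, N = 632, class-interval half a unit.
m = mean_from_totals(origin=3.5, sum_f_xi=509 - 776, n=632, interval=0.5)

# The same quotient wrongly added to A as though the interval were a whole unit.
wrong = mean_from_totals(origin=3.5, sum_f_xi=509 - 776, n=632, interval=1.0)

print(round(m, 2), round(wrong, 2))
```

The first figure agrees with the 3.29 per cent. of the text; the second shows the error committed if the quotient is not halved.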

10. As the process is an important one we give a second illustration
from the figures of Table VI., Chap. VI. In this case the class-interval
is a unit (1 inch), so the value of $M - A$ is given directly
by dividing $\Sigma(f.\xi)$ by $N$. The student must notice that, measures
having been made to the nearest eighth of an inch, the mid-values
of the intervals are 57 7/16, 58 7/16, etc., and not 57.5, 58.5, etc.


Calculation of the Mean: Example ii. Calculation of the Arithmetic
Mean Stature of Male Adults in the British Isles from the Figures of
Chap. VI., Table VI., p. 88.

    (1)         (2)          (3)                 (4)
    Height,     Frequency    Deviation from      Product
    Inches.     $f$          Arbitrary           $f.\xi$
                             Value $A$, $\xi$

    57-            2         -10                   20
    58-            4          -9                   36
    59-           14          -8                  112
    60-           41          -7                  287
    61-           83          -6                  498
    62-          169          -5                  845
    63-          394          -4                 1576
    64-          669          -3                 2007
    65-          990          -2                 1980
    66-         1223          -1                 1223
    67-         1329           0                    0
    68-         1230          +1                 1230
    69-         1063          +2                 2126
    70-          646          +3                 1938
    71-          392          +4                 1568
    72-          202          +5                 1010
    73-           79          +6                  474
    74-           32          +7                  224
    75-           16          +8                  128
    76-            5          +9                   45
    77-            2         +10                   20

    Total       8585
    Total of negative products                  -8584
    Total of positive products                  +8763

$\Sigma(f.\xi) = +8763 - 8584 = +179$
$M - A = +\frac{179}{8585} = +0.02$ class-intervals or inches.
$\therefore M = 67.44 + 0.02 = 67.46$ inches.
It is evident that an absolute check on the arithmetic of any
such calculation may be effected by taking a different arbitrary
origin for the deviations: all the figures of col. (4) will be changed,
but the value ultimately obtained for the mean must be the
same. The student should note that a classification by unequal
intervals is, at best, a hindrance to this simple form of calculation,
and the use of an indefinite interval for the extremity of the
distribution renders the exact calculation of the mean impossible
(cf. Chap. VI. § 10).
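
The check by a change of origin may be sketched in Python for the stature figures of Example ii. The frequencies are those of the table, the first being taken as 2, the value required by the printed totals 8585 and 8584; the helper name `mean_about` and the second origin are assumptions made for illustration.

```python
from fractions import Fraction

# Frequencies of Example ii, classes 57-, 58-, ..., 77- (inches).
freqs = [2, 4, 14, 41, 83, 169, 394, 669, 990, 1223, 1329,
         1230, 1063, 646, 392, 202, 79, 32, 16, 5, 2]

# Mid-values 57 7/16, 58 7/16, ... kept as exact fractions.
mids = [Fraction(57 * 16 + 7, 16) + k for k in range(len(freqs))]


def mean_about(origin_index):
    """Mean by deviations (in class-intervals of 1 inch) from the chosen class."""
    n = sum(freqs)
    total = sum(f * (k - origin_index) for k, f in enumerate(freqs))
    return mids[origin_index] + Fraction(total, n), total


mean_a, total_a = mean_about(10)  # origin at 67 7/16, as in the text
mean_b, total_b = mean_about(8)   # a different origin, 65 7/16

print(total_a, total_b, float(mean_a), float(mean_b))
```

The totals of column (4) differ entirely between the two origins, but the two values of the mean are identical, which is the check described above.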

Fig. 21. Showing the Arithmetic Mean M, the Median Mi, and the Mode Mo,
by verticals drawn through the corresponding points on the base, for the
distribution of pauperism of fig. 10, p. 92. (Horizontal axis: percentage
of the population in receipt of relief, 0 to 10.)

11. We return again below (§ 13) to the question of the
errors caused by the assumption that all values within the same
interval may be treated as approximately the mid-value of the
interval. It is sufficient to say here that the error is in general
very small and of uncertain sign for a distribution of the
symmetrical or only moderately asymmetrical type, provided of
course the class-interval is not large (Chap. VI. § 5). In the case
of the "J-shaped" or extremely asymmetrical distribution, however,
the error is evidently of definite sign, for in all the intervals
the frequency is piled up at the limit lying towards the greatest
frequency, i.e. the lower end of the range in the case of the illustrations
given in Chap. VI., and is not evenly distributed over the
interval. In distributions of such a type the intervals must be
made  very  small  indeed  to  secure  an  approximately  accurate  value 
for  the  mean.  The  student  should  test  for  himself  the  effect  of 
different  groupings  in  two  or  three  different  cases,  so  as  to  get 
some  idea  of  the  degree  of  inaccuracy  to  be  expected. 
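
An experiment of the kind recommended may be sketched in Python. The series below is artificial: it is assumed, purely for illustration, to follow an exponential curve, which gives a J-shaped distribution with the frequency greatest at the lower end of the range. The interval widths tried are likewise arbitrary.

```python
import math


def grouped_mean(values, width):
    """Mean after replacing every value by the mid-value of its class-interval."""
    mids = [(math.floor(v / width) + 0.5) * width for v in values]
    return sum(mids) / len(mids)


# An artificial J-shaped series: evenly spaced quantiles of an exponential
# curve, so that the frequency is greatest at the lower end of the range.
n = 10_000
values = [-math.log(1 - (i + 0.5) / n) for i in range(n)]
true_mean = sum(values) / n

# Error of the grouped mean for several widths of class-interval.
errors = {w: grouped_mean(values, w) - true_mean for w in (2.0, 1.0, 0.5, 0.1)}
for w, e in errors.items():
    print(f"interval {w}: error {e:+.4f}")
```

With the wider intervals the grouped mean is too great, since the mid-value of each interval lies above the mean of the values actually falling within it; the error diminishes rapidly as the interval is reduced.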

12. If a diagram has been drawn representing the frequency-distribution,
the position of the mean may conveniently be
indicated by a vertical through the corresponding point on the
base. Thus fig. 21 (a reproduction of fig. 10) shows the frequency-polygon
for our first illustration, and the vertical MM indicates
the mean. In any moderately asymmetrical distribution of
this form the mean lies, as in the present example, on the side of
the greatest frequency towards the longer "tail" of the distribution:
M in fig. 22 shows similarly the position of the mean in
an ideal distribution. In a symmetrical distribution the mean
coincides with the centre of symmetry. The student should mark
the position of the mean in the diagram of every frequency-distribution
that he draws, and so accustom himself to thinking of
the mean, not as an abstraction, but always in relation to the
frequency-distribution of the variable concerned.

Fig. 22. Mean M, Median Mi, and Mode Mo, of the ideal moderately
asymmetrical distribution.

13.  The  following  examples  give  important  properties  of  the 
arithmetic  mean,  and  at  the  same  time  illustrate  the  facility  of  its 
algebraic  treatment : — 

(a)  The  sum  of  the  deviations  from  the  mean,  taken  with  their 
proper  signs,  is  zero. 

This follows at once from equation (4): for if $M$ and $A$ are
identical, evidently $\Sigma(f.\xi)$ must be zero.
(b) If a series of $N$ observations of a variable $X$ consist of, say,
two component series, the mean of the whole series can be
readily expressed in terms of the means of the two components.
For if we denote the values in the first series by $X_1$ and in the
second series by $X_2$,

$$\Sigma(X) = \Sigma(X_1) + \Sigma(X_2),$$

that is, if there be $N_1$ observations in the first series and $N_2$ in
the second, and the means of the two series be $M_1$, $M_2$ respectively,

$$N.M = N_1.M_1 + N_2.M_2 \qquad (5)$$

For example, we find from the data of Table VI., Chap. VI.,

    Mean stature of the 346 men born in Ireland = 67.78 in.
    Mean stature of the 741 men born in Wales   = 66.62 in.

Hence the mean stature of the 1087 men born in the two countries
is given by the equation:

$$1087.M = (346 \times 67.78) + (741 \times 66.62).$$

That is, $M$ = 66.99 inches. It is evident that the form of the
relation (5) is quite general: if there are $r$ series of observations
$X_1, X_2, \ldots, X_r$, the mean $M$ of the whole series is related to
the means $M_1, M_2, \ldots, M_r$ of the component series by the
equation

$$N.M = N_1.M_1 + N_2.M_2 + \ldots + N_r.M_r \qquad (6)$$

For the convenient checking of arithmetic, it is useful to note
that, if the same arbitrary origin $A$ for the deviations $\xi$ be taken
in each case, we must have, denoting the component series by the
subscripts 1, 2, . . . r as before,

$$\Sigma(f.\xi) = \Sigma(f_1.\xi_1) + \Sigma(f_2.\xi_2) + \ldots + \Sigma(f_r.\xi_r) \qquad (7)$$

The  agreement  of  these  totals  accordingly  checks  the  work. 

As  an  important  corollary  to  the  general  relation  (6),  it  may 
be  noted  that  the  approximate  value  for  the  mean  obtained  from 
any  frequency  distribution  is  the  same  whether  we  assume  (1) 
that  all  the  values  in  any  class  are  identical  with  the  mid-value 
of  the  class-interval,  or  (2)  that  the  mean  of  the  values  in  the 
class  is  identical  with  the  mid-value  of  the  class-interval. 

(c)  The  mean  of  all  the  sums  or  differences  of  corresponding 
observations  in  two  series  (of  equal  numbers  of  observations)  is 
equal  to  the  sum  or  difference  of  the  means  of  the  two  series. 

This follows almost at once. For if

$$X = X_1 \pm X_2,$$
$$\Sigma(X) = \Sigma(X_1) \pm \Sigma(X_2).$$

That is, if $M$, $M_1$, $M_2$ be the respective means,

$$M = M_1 \pm M_2 \qquad (8)$$

Evidently the form of this result is again quite general, so that
if

$$X = X_1 \pm X_2 \pm \ldots \pm X_r,$$
$$M = M_1 \pm M_2 \pm \ldots \pm M_r \qquad (9)$$

As a useful illustration of equation (8), consider the case of
measurements of any kind that are subject (as indeed all
measures must be) to greater or less errors. The actual measurement
$X$ in any such case is the algebraic sum of the true
measurement $X_1$ and an error $X_2$. The mean of the actual
measurements $M$ is therefore the sum of the true mean $M_1$, and
the arithmetic mean of the errors $M_2$. Only if the
latter be zero, will the observed mean be identical with the true
mean. Errors of grouping (§ 11) are a case in point.
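
The three properties (a), (b) and (c) may be verified numerically by a short Python sketch. The statures under (b) are those quoted in the text; the small series under (a) and (c) are hypothetical and chosen only for illustration.

```python
def mean(values):
    """Arithmetic mean: the total divided by the number of observations."""
    return sum(values) / len(values)


# (a) Deviations from the mean, taken with their signs, sum to zero.
x = [4, 7, 7, 10, 12]
m = mean(x)
sum_dev = sum(v - m for v in x)

# (b) Equation (5) with the statures quoted in the text.
n1, m1 = 346, 67.78   # men born in Ireland
n2, m2 = 741, 66.62   # men born in Wales
pooled = (n1 * m1 + n2 * m2) / (n1 + n2)

# (c) Equation (8): the mean of the sums equals the sum of the means.
true_values = [10.0, 12.0, 14.0, 16.0]   # hypothetical true measurements
errors = [0.3, -0.1, -0.4, 0.2]          # hypothetical errors of mean zero
observed = [t + e for t, e in zip(true_values, errors)]

print(sum_dev, round(pooled, 2))
print(mean(observed), mean(true_values) + mean(errors))
```

Since the hypothetical errors have a mean of zero, the mean of the observed values agrees (to within rounding in the machine arithmetic) with the true mean, as stated in the closing illustration.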

14. The median. The median may be defined as the middlemost
or central value of the variable when the values are ranged
in  order  of  magnitude,  or  as  the  value  such  that  greater  and 
smaller  values  occur  with  equal  frequency.  In  the  case  of  a 
frequency-curve,  the  median  may  be  defined  as  that  value  of  the 
variable  the  vertical  through  which  divides  the  area  of  the  curve 
into  two  equal  parts,  as  the  vertical  through  Mi  in  fig.  22. 

The median, like the mean, fulfils the conditions (b) and (c)
of § 4, seeing that it is based on all the observations made, and
that it possesses the simple property of being the central or
middlemost value, so that its nature is obvious. But the definition
does not necessarily lead in all cases to a determinate value.
If there be an odd number of different values of $X$ observed, say
$2n+1$, the $(n+1)$th in order of magnitude is the only value
fulfilling the definition. But if there be an even number, say
$2n$ different values, any value between the $n$th and $(n+1)$th
fulfils the conditions. In such a case it appears to be usual to
take the mean of the $n$th and $(n+1)$th values as the median,
but  this  is  a  convention  supplementary  to  the  definition.  It 
should  also  be  noted  that  in  the  case  of  a  discontinuous  variable 
the  second  form  of  the  definition  in  general  breaks  down :  if  we 
range  the  values  in  order  there  is  always  a  middlemost  value 
(provided  the  number  of  observations  be  odd),  but  there  is  not,  as  a 
rule,  any  value  such  that  greater  and  less  values  occur  with  equal 
frequency. Thus in Table III., § 3 of Chap. VI., we see that 45 per
cent. of the poppy capsules had 12 or fewer stigmatic rays, 55
per cent. had 13 or more; similarly 61 per cent. had 13 or fewer
rays, 39 per cent. had 14 or more. There is no number of rays
such that the frequencies in excess and defect are equal.
In the case of the buttercups of Table XIV. (Chap. VI. § 15)
there  is  no  number  of  petals  that  even  remotely  fulfils  the 
required  condition.  An  analogous  difficulty  may  arise,  it  may 
be  remarked,  even  in  the  case  of  an  odd  number  of  observations 
of  a  continuous  variable  if  the  number  of  observations  be  small 
and  several  of  the  observed  values  identical.  The  median  is 
therefore  a  form  of  average  of  most  uncertain  meaning  in  cases 
of  strictly  discontinuous  variation,  for  it  may  be  exceeded  by 
5, 10, 15, or 20 per cent. only of the observed values, instead of
by  50  per  cent. :  its  use  in  such  cases  is  to  be  deprecated,  and 
is  perhaps  best  avoided  in  any  case,  whether  the  variation  be 
continuous  or  discontinuous,  in  which  small  series  of  observations 
have  to  be  dealt  with. 
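
The convention for an even number of observations, and the failure of the second form of the definition for a discontinuous variable, may be sketched in Python. The function names and the counts of petals are hypothetical, being invented for illustration and not taken from Table XIV.

```python
def median_by_convention(values):
    """Middlemost value; for an even count, the mean of the two central values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def share_below_and_above(values, m):
    """Proportions of the observations strictly below and strictly above m."""
    n = len(values)
    below = sum(v < m for v in values) / n
    above = sum(v > m for v in values) / n
    return below, above


# Hypothetical counts of a discontinuous variable (e.g. petals on a flower).
petals = [5] * 14 + [6] * 3 + [7] * 2 + [8]
m = median_by_convention(petals)
print(m, share_below_and_above(petals, m))
```

In this invented series the median is exceeded by 30 per cent. only of the observations, and no value at all falls below it, so that it in no way divides the series into equal halves.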

15. When a table showing the frequency-distribution for a
long series of observations of a continuous variable is given, no
difficulty arises, as a sufficiently approximate value of the median
can be readily determined by simple interpolation on the hypothesis
that the values in each class are uniformly distributed
throughout the interval. Thus, taking the figures in our first
illustration of the method of calculating the mean, the total
number of observations (registration districts) is 632, of which
the half is 316. Looking down the table, we see that there are
227 districts with not more than 2.75 per cent. of the population
in receipt of relief, and 100 more with between 2.75 and 3.25
per cent. But only 89 are required to make up the total of 316;
hence the value of the median is taken as

$$2.75 + \frac{89}{100} \times \frac{1}{2} = 2.75 + 0.445 = 3.195 \text{ per cent.}$$

The mean being 3.29, the median is slightly less; its position
is indicated by Mi in fig. 21.

The  value  of  the  median  stature  of  males  may  be  similarly 
calculated  from  the  data  of  the  second  illustration.  The  work 
may  be  indicated  thus  : — 

    Half the total number of observations (8585)  = 4292.5
    Total frequency under 66 15/16 inches          = 3589
    Difference                                     =  703.5
    Frequency in next interval                     = 1329

    Therefore median = 66 15/16 + 703.5/1329
                     = 67.47 inches.
The difference between median and mean in this case is
therefore  only  about  one-hundredth  of  an  inch,  the  smallness 
of  the  difference  arising  from  the  approximate  symmetry  of 
the  distribution.  In  an  absolutely  symmetrical  distribution 
it  is  evident  that  mean  and  median  must  coincide. 
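
The interpolation for the median may be sketched in Python with the figures of the two examples. The function and its parameter names are assumptions made for illustration; the numbers are those quoted in the text.

```python
def interpolated_median(lower_limit, cum_below, class_freq, interval, n):
    """Median by simple interpolation within the class containing the N/2-th value.

    lower_limit : lower limit of the class in which the median falls
    cum_below   : total frequency below that limit
    class_freq  : frequency of that class
    interval    : width of the class-interval
    n           : total number of observations
    """
    needed = n / 2 - cum_below
    return lower_limit + interval * needed / class_freq


# Pauperism: 227 districts below 2.75, 100 in the next half-unit interval.
pauperism = interpolated_median(2.75, 227, 100, 0.5, 632)

# Stature: 3589 men under 66 15/16 inches, 1329 in the next inch.
stature = interpolated_median(66 + 15 / 16, 3589, 1329, 1.0, 8585)

print(round(pauperism, 3), round(stature, 2))
```

The hypothesis of uniform distribution within the class enters only through the simple proportion `needed / class_freq`.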

16. Graphical interpolation may, if desired, be substituted
for arithmetical interpolation. Taking, again, the figures of
Example i., the number of districts with pauperism not exceeding
2.25 is 138; not exceeding 2.75, 227; not exceeding 3.25, 327;
and not exceeding 3.75, 417. Plot the numbers of districts
with pauperism not exceeding each value X to the corresponding
value of X on squared paper, to a good large scale, as in fig. 23,
and draw a smooth curve through the points thus obtained,
preferably with the aid of one of the "curves," splines, or flexible
curves sold by instrument-makers for the purpose. The point
in which the smooth curve so obtained cuts the horizontal line
corresponding to a total frequency N/2 = 316 gives the median.
In general the curve is so flat that the value obtained by this
graphical method does not differ appreciably from that calculated
arithmetically (the arithmetical process assuming that the
curve is a straight line between the points on either side of
the median); if the curvature is considerable, the graphical
value (assuming, of course, careful and accurate draughtsmanship)
is to be preferred to the arithmetical value, as it does not
involve the crude assumption that the frequency is uniformly
distributed over the interval in which the median lies.

Fig. 23. Determination of the median by graphical interpolation.
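
A numerical counterpart of the graphical process may be sketched in Python. It is an assumption of this sketch, not a method given in the text, that the hand-drawn smooth curve may be replaced by the cubic passing through the four plotted points; the function names are likewise invented.

```python
def lagrange(points, x):
    """Value at x of the polynomial passing through all the given (x, y) points."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def solve_increasing(f, target, lo, hi, tol=1e-9):
    """Bisection for f(x) = target, assuming f increases from lo to hi."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Cumulative frequencies quoted in the text: districts not exceeding each value.
cumulative = [(2.25, 138), (2.75, 227), (3.25, 327), (3.75, 417)]

# Where the smooth curve cuts the level N/2 = 316, between 2.75 and 3.25.
median = solve_increasing(lambda x: lagrange(cumulative, x), 632 / 2, 2.75, 3.25)
print(round(median, 3))
```

The curve being nearly straight over this interval, the value so found differs very little from the 3.195 obtained by simple proportion.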

17.  A  comparison  of  the  calculations  for  the  mean  and 
for  the  median  respectively  will  show  that  on  the  score  of 
brevity  of  calculation  the  median  has  a  distinct  advantage. 
When,  however,  the  ease  of  algebraical  treatment  of  the  two 
forms  of  average  is  compared,  the  superiority  lies  wholly  on 
the  side  of  the  mean.  As  was  shown  in  §  13,  when  several  series 
of  observations  are  combined  into  a  single  series,  the  mean  of 
the  resultant  distribution  can  be  simply  expressed  in  terms 
of  the  means  of  the  components.  The  expression  of  the 
median  of  the  resultant  distribution  in  terms  of  the  medians 
of  the  components  is,  however,  not  merely  complex  and  difficult, 
but  impossible :  the  value  of  the  resultant  median  depends  on 
the  forms  of  the  component  distributions,  and  not  on  their 
medians  alone.  If  two  symmetrical  distributions  of  the  same 
form  and  with  the  same  numbers  of  observations,  but  with 
different  medians,  be  combined,  the  resultant  median  must 
evidently  (from  symmetry)  coincide  with  the  resultant  mean,  i.e. 
lie  halfway  between  the  means  of  the  components.  But  if  the 
two  components  be  asymmetrical,  or  (whatever  their  form) 
if  the  degrees  of  dispersion  or  numbers  of  observations  in  the 
two  series  be  different,  the  resultant  median  will  not  coincide 
with  the  resultant  mean,  nor  with  any  other  simply  assignable 
value.  It  is  impossible,  therefore,  to  give  any  theorem  for 
medians  analogous  to  equations  (5)  and  (6)  for  means.  It  is 
equally  impossible  to  give  any  theorem  analogous  to  equations 
(8)  and  (9)  of  §  13.  The  median  of  the  sum  or  difference  of 
pairs  of  corresponding  observations  in  two  series  is  not, 
in  general,  equal  to  the  sum  or  difference  of  the  medians  of 
the  two  series ;  the  median  value  of  a  measurement  subject  to 
error  is  not  necessarily  identical  with  the  true  median,  even 
if  the  median  error  be  zero,  i.e.  if  positive  and  negative  errors 
be  equally  frequent. 
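
The contrast between the two averages may be shown by a small Python sketch. All the series are hypothetical, being chosen merely so that the component medians agree while the forms of the distributions differ.

```python
from statistics import mean, median

first = [1, 5, 9]           # median 5
second_a = [6, 7, 8]        # median 7
second_b = [0, 7, 100]      # median 7 again, but of a very different form

# The pooled medians differ although the component medians are the same.
pooled_a = median(first + second_a)
pooled_b = median(first + second_b)


def pooled_mean(*series):
    """Equation (6): the pooled mean from the sizes and means of the components."""
    return sum(len(s) * mean(s) for s in series) / sum(len(s) for s in series)


# The median of the sums is not, in general, the sum of the medians.
x1 = [1, 2, 9]
x2 = [9, 2, 1]
sums = [a + b for a, b in zip(x1, x2)]

print(pooled_a, pooled_b)
print(pooled_mean(first, second_b), mean(first + second_b))
print(median(sums), median(x1) + median(x2))
```

The pooled mean is recovered from the component means in every case, whereas no rule founded on the component medians alone could give both of the pooled medians.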

18.  These  limitations  render  the  applications  of  the  median  in 
any work in which theoretical considerations are necessary comparatively
circumscribed. On the other hand, the median may
have an advantage over the mean for special reasons. (a) It is
very readily calculated; a factor to which, however, as already
stated, too much weight ought not to be attached. (b) It is
readily  obtained,  without  the  necessity  of  measuring  all  the 
objects  to  be  observed,  in  any  case  in  which  they  can  be  arranged 
by  eye  in  order  of  magnitude.  If,  for  instance,  a  number  of  men 
be  ranked  in  order  of  stature,  the  stature  of  the  middlemost  is 
the  median,  and  he  alone  need  be  measured.    (On  the  other  hand 
it is useless in the cases cited at the end of § 6; the median wage
cannot  be  found  from  the  total  of  the  wages-bill,  and  the  total 
of  the  wages-bill  is  not  known  when  the  median  is  given.)  (c)  It 
is  sometimes  useful  as  a  makeshift,  when  the  observations  are  so 
given  that  the  calculation  of  the  mean  is  impossible,  owing,  e.g.,  to 
a final indefinite class, as in Table IV. (Chap. VI. § 10). (d) The
median  may  sometimes  be  preferable  to  the  mean,  owing  to  its 
being  less  affected  by  abnormally  large  or  small  values  of  the 
variable.  The  stature  of  a  giant  would  have  no  more  influence 
on  the  median  stature  of  a  number  of  men  than  the  stature  of 
any  other  man  whose  height  is  only  just  greater  than  the  median. 
If  a  number  of  men  enjoy  incomes  closely  clustering  round  a 
median  of  £500  a  year,  the  median  will  be  no  more  affected  by 
the addition to the group of a man with the income of £50,000
than by the addition of a man with an income of £5000, or even
£600.  If  observations  of  any  kind  are  liable  to  present  occasional 
greatly  outlying  values  of  this  sort  (whether  real,  or  due  to 
errors  or  blunders),  the  median  will  be  more  stable  and  less 
affected  by  fluctuations  of  sampling  than  the  arithmetic  mean. 
(In  general  the  mean  is  the  less  affected.)  The  point  is  discussed 
more  fully  later  (Chap.  XVII.).  (e)  It  may  be  added  that  the 
median  is,  in  a  certain  sense,  a  particularly  real  and  natural 
form  of  average,  for  the  object  or  individual  that  is  the  median 
object  or  individual  on  any  one  system  of  measuring  the  character 
with  which  we  are  concerned  will  remain  the  median  on  any 
other  method  of  measurement  which  leaves  the  objects  in  the 
same  relative  order.  Thus  a  batch  of  eggs  representing  eggs 
of  the  median  price,  when  prices  are  reckoned  at  so  much  per 
dozen,  will  remain  a  batch  representing  the  median  price  when 
prices  are  reckoned  at  so  many  eggs  to  the  shilling. 
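
Points (d) and (e) may be illustrated by a Python sketch. The incomes, the prices of the batches of eggs and the helper `median_item` are all hypothetical; a shilling is reckoned at twelve pence.

```python
from statistics import mean, median

incomes = [480, 490, 495, 500, 505, 510, 520]   # hypothetical incomes, in pounds
with_modest = incomes + [600]
with_large = incomes + [50_000]

print(median(with_modest), median(with_large))   # the median is unaltered
print(mean(with_modest), mean(with_large))       # the means differ greatly

# Eggs: prices per dozen (pence) for five hypothetical batches, and the same
# batches reckoned as eggs to the shilling (12 pence). The reciprocal scale
# reverses the order but leaves the same batch in the middle.
pence_per_dozen = {"A": 9, "B": 10, "C": 12, "D": 16, "E": 18}
eggs_per_shilling = {k: 12 * 12 / p for k, p in pence_per_dozen.items()}


def median_item(scores):
    """Label of the middlemost item when the items are ranked by score."""
    ranked = sorted(scores, key=scores.get)
    return ranked[len(ranked) // 2]


print(median_item(pence_per_dozen), median_item(eggs_per_shilling))
```

The same batch is the median on either system of reckoning, which is the sense in which the median is said above to be a particularly natural form of average.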

19. The Mode. The mode is the value of the variable corresponding
to the maximum of the ideal frequency-curve which
gives the closest possible fit to the actual distribution.

It  is  evident  that  in  an  ideal  symmetrical  distribution  mean, 
median  and  mode  coincide  with  the  centre  of  symmetry.  If, 
however,  the  distribution  be  asymmetrical,  as  in  fig.  22,  the  three 
forms  of  average  are  distinct.  Mo  being  the  mode.  Mi  the  median, 
and  M  the  mean.  Clearly,  the  mode  is  an  important  form  of 
average  in  the  cases  of  skew  distributions,  though  the  term  is  of 
recent  introduction  (Pearson,  ref.  II).  It  represents  the  value 
which  is  most  frequent  or  typical,  the  value  which  is  in  fact  the 
fashion (la mode). But a difficulty at once arises on attempting
to  determine  this  value  for  such  distributions  as  occur  in  practice. 
It  is  no  use  giving  merely  the  mid-value  of  the  class-interval  into 
which  the  greatest  frequency  falls,  for  this  is  entirely  dependent 
on the choice of the scale of class-intervals. It is no use making
the  class-intervals  very  small  to  avoid  error  on  that  account,  for 
the  class-frequencies  will  then  become  small  and  the  distribution 
irregular.  What  we  want  to  arrive  at  is  the  mid-value  of  the 
interval  for  which  the  frequency  would  be  a  maximum,  if  the 
intervals  could  be  made  indefinitely  small  and  at  the  same  time 
the number of observations be so increased that the class-frequencies
should run smoothly. As the observations cannot, in a
practical  case,  be  indefinitely  increased,  it  is  evident  that  some 
process  of  smoothing  out  the  irregularities  that  occur  in  the 
actual  distribution  must  be  adopted,  in  order  to  ascertain  the 
approximate  value  of  the  mode.  But  there  is  only  one  smoothing 
process  that  is  really  satisfactory,  in  so  far  as  every  observation 
can  be  taken  into  account  in  the  determination,  and  that  is  the 
method  of  fitting  an  ideal  frequency-curve  of  given  equation  to 
the  actual  figures.  The  value  of  the  variable  corresponding  to  the 
maximum  of  the  fitted  curve  is  then  taken  as  the  mode,  in 
accordance with our definition. Mo in fig. 21 is the value of the
mode so determined for the distribution of pauperism, the value
2.99 being, as it happens, very nearly coincident with the centre
of the interval in which the greatest frequency lies. The determination
of the mode by this, the only strictly satisfactory,
method must, however, be left to the more advanced student.
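
The dependence of the crude "modal" mid-value on the choice of class-intervals may be seen from a small Python sketch. The observations and the function `modal_mid_value` are hypothetical; the sketch illustrates the difficulty only, and is in no way a method of determining the true mode.

```python
from collections import Counter


def modal_mid_value(values, width, start):
    """Mid-value of the class-interval holding the greatest frequency."""
    counts = Counter(int((v - start) // width) for v in values)
    k, _ = max(counts.items(), key=lambda item: item[1])
    return start + (k + 0.5) * width


# Hypothetical observations of a continuous variable.
data = [1.1, 1.9, 2.1, 2.2, 2.8, 3.1, 3.2, 3.3, 3.9, 4.6, 5.5]

a = modal_mid_value(data, width=1.0, start=0.0)   # intervals 0-1, 1-2, ...
b = modal_mid_value(data, width=1.0, start=0.5)   # intervals 0.5-1.5, 1.5-2.5, ...
c = modal_mid_value(data, width=2.0, start=0.0)   # intervals 0-2, 2-4, ...
print(a, b, c)
```

The same observations yield different values according to the scale of intervals adopted, whence the need of a fitted curve.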

20.  At  the  same  time  there  is  an  approximate  relation  between 
mean,  median,  and  mode  that  appears  to  hold  good  with  surprising 
closeness  for  moderately  asymmetrical  distributions,  approaching 
the  ideal  type  of  fig.  9,  and  it  is  one  that  should  be  borne  in 
mind  as  giving — roughly,  at  all  events — the  relative  values  of 
these  three  averages  for  a  great  many  cases  with  which  the 
student  will  have  to  deal.    It  is  expressed  by  the  equation — 

Mode  =  Mean  -  3 (Mean  -  Median). 

That  is  to  say,  the  median  lies  one-third  of  the  distance  from  the 
mean towards the mode (compare figs. 21 and 22). For the distribution
of pauperism we have, taking the mean to three places of
decimals,

    Mean          3.289
    Median        3.195
    Difference    0.094

Hence approximate mode = 3.289 - 3 × 0.094
                       = 3.007,

or 3.01 to the second place of decimals, which is sufficient accuracy
for the final result, though three decimal places must be retained
for the calculation. The true mode, found by fitting an ideal
distribution, is 2.99. As further illustrations of the closeness
with which the relation may be expected to hold in different cases,
we  give  below  the  results  for  the  distributions  of  pauperism  in 
the  unions  of  England  and  Wales  in  the  years  1850,  1860,  1870, 
1881,  and  1891  (the  last  being  the  illustration  taken  above), 
and  also  the  results  for  the  distribution  of  barometer  heights  at 
Southampton (Table XI., Chap. VI. § 14), and similar distributions
at four other stations.

Comparison of the Approximate and True Modes in the Case of Five Distributions
of Pauperism (Percentages of the Population in receipt of
Relief) in the Unions of England and Wales. (Yule, Jour. Roy. Stat.
Soc., vol. lix., 1896.)

    Year.    Mean.    Median.    Approximate    True Mode.
                                 Mode.

    1850     6.508    6.261      5.767          5.815
    1860     5.195    5.000      4.610          4.657
    1870     5.451    5.380      5.238          5.038
    1881     3.676    3.523      3.217          3.240
    1891     3.289    3.195      3.007          2.987

Comparison of the Approximate and True Modes in the Case of Five Distributions
of the Height of the Barometer for Daily Observations at the
Stations named. (Distributions given by Karl Pearson and Alice Lee,
Phil. Trans., A, vol. cxc. (1897), p. 423.)

    Station.        Mean.     Median.    Approximate    True Mode.
                                         Mode.

    Southampton     29.981    30.000     30.038         30.039
    Londonderry     29.891    29.915     29.963         29.960
    Carmarthen      29.952    29.974     30.018         30.013
    Glasgow         29.886    29.906     29.946         29.967
    Dundee          29.870    29.890     29.930         29.951

It  will  be  seen  that  in  the  case  of  the  pauperism  figures  the 
approximate  mode  only  diverges  markedly  from  the  true  value 
in  the  year  1870,  a  year  in  which  the  frequency-distribution  was 
very  irregular.  In  all  the  other  years  the  difference  between  the 
true  and  approximate  values  of  the  mode  is  hardly  greater  than 
the  alteration  that  might  be  caused  in  the  true  mode  itself  by 
slight  variations  in  the  method  of  fitting  the  curve  to  the  actual 
distribution. Similar remarks apply to the second series of illustrations;
the true and approximate values are extremely close,
except  in  the  case  of  Dundee  and  Glasgow,  where  the  divergence 
reaches  two-hundredths  of  an  inch. 
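
The approximate relation may be applied to the tabulated pauperism figures by a short Python sketch. The function name is an invention for illustration; the means, medians and true modes are those printed in the first table above.

```python
def approximate_mode(mean, median):
    """Mode = Mean - 3 (Mean - Median), the empirical relation of the text."""
    return mean - 3 * (mean - median)


# (mean, median, true mode) as tabulated for the pauperism distributions.
pauperism = {
    1850: (6.508, 6.261, 5.815),
    1860: (5.195, 5.000, 4.657),
    1870: (5.451, 5.380, 5.038),
    1881: (3.676, 3.523, 3.240),
    1891: (3.289, 3.195, 2.987),
}

for year, (mean, median, true_mode) in pauperism.items():
    approx = approximate_mode(mean, median)
    # Approximate mode, and its divergence from the mode found by curve-fitting.
    print(year, round(approx, 3), round(approx - true_mode, 3))
```

The divergences so printed are small in every year but 1870, in agreement with the remarks above.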

21.  Summing  up  the  preceding  paragraphs,  we  may  say  that 
the  mean  is  the  form  of  average  to  use  for  all  general  purposes ; 
it is simply calculated, its value is always determinate, its
algebraic  treatment  is  particularly  easy,  and  in  most  cases  it  is 
rather  less  affected  than  the  median  by  errors  of  sampling.  The 
median  is,  it  is  true,  somewhat  more  easily  calculated  from  a  given 
frequency-distribution  than  is  the  mean ;  it  is  sometimes  a  useful 
makeshift,  and  in  a  certain  class  of  cases  it  is  more  and  not  less 
stable  than  the  mean  ;  but  its  use  is  undesirable  in  cases  of  discon- 
tinuous variation,  its  value  may  be  indeterminate,  and  its  algebraic 
treatment  is  difficult  and  often  impossible.  The  mode,  finally, 
is  a  form  of  average  hardly  suitable  for  elementary  use,  owing 
to  the  difficulty  of  its  determination,  but  at  the  same  time  it 
represents  an  important  value  of  the  variable.  The  arithmetic 
mean  should  invariably  be  employed  unless  there  is  some  very 
definite  reason  for  the  choice  of  another  form  of  average,  and  the 
elementary  student  will  do  very  well  if  he  limits  himself  to  its 
use.  Objection  is  sometimes  taken  to  the  use  of  the  mean  in  the 
case  of  asymmetrical  frequency-distributions,  on  the  ground  that 
the  mean  is  not  the  mode,  and  that  its  value  is  consequently 
misleading.  But  no  one  in  the  least  degree  familiar  with  the 
manifold  forms  taken  by  frequency-distributions  would  regard  the 
two  as  in  general  identical ;  and  while  the  importance  of  the  mode 
is  a  good  reason  for  stating  its  value  in  addition  to  that  of  the 
mean,  it  cannot  replace  the  latter.  The  objection,  it  may  be  noted, 
would  apply  with  almost  equal  force  to  the  median,  for,  as  we  have 
seen  (§  20),  the  difference  between  mode  and  median  is  usually 
about  two-thirds  of  the  difference  between  mode  and  mean. 

22. The Geometric Mean. The geometric mean $G$ of a series of
values $X_1, X_2, X_3, \ldots, X_n$ is defined by the relation

$$G = (X_1 \cdot X_2 \cdot X_3 \cdots X_n)^{1/n} \qquad (10)$$

The definition may also be expressed in terms of logarithms,

$$\log G = \frac{1}{n}\,\Sigma(\log X) \qquad (11)$$

that  is  to  say,  the  logarithm  of  the  geometric  mean  of  a  series  of 
values  is  the  arithmetic  mean  of  their  logarithms. 

The geometric mean of a given series of quantities is always
less than their arithmetic mean, unless all the quantities are equal,
in which case the two coincide; the student will find a proof in
most text-books of algebra, and in ref. 10.  The magnitude of
the difference depends largely on the amount of dispersion of the
variable in proportion to the magnitude of the mean (cf. Chap.
VIII., Question 8).  The geometric mean is necessarily zero, it should
be noticed, if even a single value of X is zero, and it may become
imaginary if negative values occur.  Excluding these cases, the value
of the geometric mean is always determinate and is rigidly defined.
The computation is a little long, owing to the necessity of taking
logarithms: it is hardly necessary to give an example, as the
method is simply that of finding the arithmetic mean of the
logarithms of X (instead of the values of X) in accordance with
equation (11).  If there are many observations, a table should be
drawn up giving the frequency-distribution of log X, and the
mean should be calculated as in Examples i. and ii. of §§ 9 and 10.
The geometric mean has never come into general use as a
representative average, partly, no doubt, on account of its rather
troublesome computation, but principally on account of its
somewhat abstract mathematical character (cf. § 4 (c)): the geometric
mean does not possess any simple and obvious properties which
render its general nature readily comprehensible.
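
The computation by logarithms may be sketched as follows. This is a
minimal illustration only: the function names and the sample values
are hypothetical, and the treatment of zero and negative values simply
follows the remarks made above.

```python
import math


def geometric_mean(values):
    """Geometric mean computed as exp(mean of logs), as in equation (11)."""
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")
    if any(v < 0 for v in values):
        raise ValueError("negative values: the geometric mean may be imaginary")
    if any(v == 0 for v in values):
        return 0.0  # a single zero makes the whole product zero
    return math.exp(sum(math.log(v) for v in values) / len(values))


def arithmetic_mean(values):
    """Ordinary arithmetic mean, for comparison."""
    values = list(values)
    return sum(values) / len(values)


sample = [2.0, 4.0, 8.0]      # hypothetical observations
g = geometric_mean(sample)    # cube root of 2 * 4 * 8
m = arithmetic_mean(sample)
```

For the sample chosen the product is 64, so that the geometric mean is
4, which is less than the arithmetic mean of the same values.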

23. At the same time, as the following examples show, the
geometric mean possesses some important properties, and is readily
treated algebraically in certain cases.

(a) If the series of observations X consist of r component
series, there being $N_1$ observations in the first, $N_2$ in the second,
and so on, the geometric mean G of the whole series can be
readily expressed in terms of the geometric means $G_1$, $G_2$, etc., of
the component series.  For evidently we have at once (as in § 13 (b)),
N being the whole number of observations,

$$N \log G = N_1 \log G_1 + N_2 \log G_2 + \cdots + N_r \log G_r \qquad (12)$$

(b) The geometric mean of the ratios of corresponding observations
in two series is equal to the ratio of their geometric means.

For if

$$X = X_1 / X_2,$$

$$\log X = \log X_1 - \log X_2,$$

then summing for all pairs of $X_1$'s and $X_2$'s,

$$G = G_1 / G_2 \qquad (13)$$

(c) Similarly, if a variable X is given as the product of any
number of others, i.e. if

$$X = X_1 \cdot X_2 \cdot X_3 \cdots X_r,$$

$X_1, X_2, \ldots, X_r$ denoting corresponding observations in r
different series, the geometric mean G of X is expressed in terms
of the geometric means $G_1, G_2, \ldots, G_r$ of $X_1, X_2, \ldots, X_r$ by
the relation

$$G = G_1 \cdot G_2 \cdot G_3 \cdots G_r \qquad (14)$$

That  is  to  say,  the  geometric  mean  of  the  product  is  the  product 
of  the  geometric  means. 
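
The three properties (a), (b) and (c) may be verified numerically.
The following is a sketch on hypothetical positive values; the series
x1, x2, x3 and the helper geometric_mean are assumptions made for
illustration and are not taken from the text.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive values, via the mean of their logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical positive observations.
x1 = [2.0, 8.0, 4.0]
x2 = [1.0, 4.0, 16.0]          # paired with x1 for (b) and (c)
x3 = [1.0, 4.0, 16.0, 2.0]     # a second component series for (a)

# (a) Pooled series: N log G = N1 log G1 + N2 log G2.
pooled = x1 + x3
lhs_a = len(pooled) * math.log(geometric_mean(pooled))
rhs_a = (len(x1) * math.log(geometric_mean(x1))
         + len(x3) * math.log(geometric_mean(x3)))

# (b) Ratios of corresponding observations: G = G1 / G2.
g_of_ratios = geometric_mean([a / b for a, b in zip(x1, x2)])
ratio_of_g = geometric_mean(x1) / geometric_mean(x2)

# (c) Products of corresponding observations: G = G1 * G2.
g_of_products = geometric_mean([a * b for a, b in zip(x1, x2)])
product_of_g = geometric_mean(x1) * geometric_mean(x2)
```

In each case the two quantities computed agree to within the accuracy
of floating-point arithmetic.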

[Fig. 24. Showing the populations of certain rural counties of England
(Cumberland, Dorset, Hereford) for each census year from 1801 to 1901.
Horizontal axis: census year.]

24. The use of the geometric mean finds its simplest application
in estimating the numbers of a population midway between two
epochs (say two census years) at which the population is known.
If nothing is known concerning the increase of the population
save that the numbers recorded at the first census were $P_0$ and at
the second census n years later $P_n$, the most reasonable assumption
to make is that the percentage increase in each year has
been the same, so that the populations in successive years form a
geometric series, $P_0 r$ being the population a year after the first
census, $P_0 r^2$ two years after the first census, and so on, and

$$P_n = P_0 \cdot r^n \qquad (15)$$

The population midway between the two censuses is therefore

$$P_{n/2} = P_0 \cdot r^{n/2} = (P_0 \cdot P_n)^{1/2} \qquad (16)$$

i.e. the geometric mean of the numbers given by the two censuses.
This  result  must,  however,  be  used  with  discretion.  The  rate  of 
increase  of  population  is  not  necessarily,  or  even  usually,  constant 
over  any  considerable  period  of  time  :  if  it  were  so,  a  curve 
representing the growth of population as in fig. 24 would be
continuously  convex  to  the  base,  whether  the  population  were 
increasing  or  decreasing.  In  the  diagram  it  will  be  seen  that 
the  curves  are  frequently  concave  towards  the  base,  and  similar 
results  will  often  be  found  for  districts  in  which  the  population  is 
not  increasing  very  rapidly,  and  from  which  there  is  much 
emigration.  Further,  the  assumption  is  not  self-consistent  in  any 
case  in  which  the  rate  of  increase  is  not  uniform  over  the  entire 
area — and  almost  any  area  can  be  analysed  into  parts  which  are  not 
similar in this respect.  For if in one part of the area considered
the initial population is $P_0$ and the common ratio $R$, and in the
remainder of the area the initial population is $p_0$ and the common
ratio $r$, the population in year n is given by

$$P_0 \cdot R^n + p_0 \cdot r^n$$

This does not represent a constant rate of increase unless $R = r$.
If  then,  for  example,  a  constant  percentage  rate  of  increase  be 
assumed  for  England  and  Wales  as  a  whole,  it  cannot  be  assumed 
for  the  Counties  :  if  it  be  assumed  for  the  Counties,  it  cannot  be 
assumed  for  the  country  as  a  whole.  The  student  is  referred  to 
refs.  14,  15  for  a  discussion  of  methods  that  may  be  used  for  the 
consistent  estimation  of  populations  under  such  circumstances. 
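
The want of self-consistency may be seen in a small numerical case.
The sketch below uses purely hypothetical census figures for an area
divided into two parts (they are not data from the text), and the
names midpoint_estimate and parts are likewise illustrative.

```python
import math


def midpoint_estimate(first_census, second_census):
    """Population midway between two censuses, assuming a constant
    percentage rate of increase: the geometric mean of the two counts."""
    return math.sqrt(first_census * second_census)


# Hypothetical two-part area: (first census, second census).
parts = {
    "slow-growing part": (100_000, 121_000),
    "fast-growing part": (100_000, 400_000),
}

# Assumption applied to each part separately, the estimates then being summed.
sum_of_parts = sum(midpoint_estimate(p0, pn) for p0, pn in parts.values())

# Assumption applied to the area as a whole.
total_first = sum(p0 for p0, _ in parts.values())
total_second = sum(pn for _, pn in parts.values())
whole_area = midpoint_estimate(total_first, total_second)
```

With these figures the estimate for the whole area exceeds the sum of
the estimates for the parts by more than twelve thousand persons, so
that the two applications of the same assumption cannot both hold.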

25.  The  property  of  the  geometric  mean  illustrated  by  equation 
(13)  renders  it,  in  some  respects,  a  peculiarly  convenient  form  of 
average  in  dealing  with  ratios,  i.e.  "index-numbers,"  as  they  are 
termed,  of  prices.  Let 

$$\begin{array}{l}
X'_0,\ X''_0,\ X'''_0,\ \ldots,\ X^{(N)}_0 \\
X'_1,\ X''_1,\ X'''_1,\ \ldots,\ X^{(N)}_1 \\
X'_2,\ X''_2,\ X'''_2,\ \ldots,\ X^{(N)}_2
\end{array}$$

denote the prices of N commodities in the years 0, 1, 2 ....
Further, let $Y'_{10} = X'_1 / X'_0$, and so on, so that

$$\begin{array}{l}
Y'_{10},\ Y''_{10},\ Y'''_{10},\ \ldots,\ Y^{(N)}_{10} \\
Y'_{20},\ Y''_{20},\ Y'''_{20},\ \ldots,\ Y^{(N)}_{20}
\end{array}$$

represent the ratios of the prices of the several commodities in years
1, 2, ... to their prices in year 0.  These ratios, in practice
multiplied  by  100,  are  termed  index-numbers  of  the  prices  of  the 
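several commodities, on the year 0 as base.  Evidently some
form of average of the Y's for any given year will afford an
indication of the general level of prices for that year, provided the
commodities chosen are sufficiently numerous and representative.
The question is, what form of average to choose.  If the geometric
mean be chosen, and $G_{10}$, $G_{20}$ denote the geometric means of the
Y's for the years 1 and 2 respectively, we have

$$\frac{G_{20}}{G_{10}}
  = \left(\frac{Y'_{20}}{Y'_{10}} \cdot \frac{Y''_{20}}{Y''_{10}} \cdot \frac{Y'''_{20}}{Y'''_{10}} \cdots\right)^{1/N}
  = \left(\frac{X'_2}{X'_1} \cdot \frac{X''_2}{X''_1} \cdot \frac{X'''_2}{X'''_1} \cdots\right)^{1/N}
  = \left(Y'_{21} \cdot Y''_{21} \cdot Y'''_{21} \cdots\right)^{1/N} \qquad (17)$$

where $Y'_{21} = X'_2 / X'_1$, and so on, are the ratios on year 1 as base.
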
From  the  first  form  of  this  equation  we  see  that  the  ratio  of  the 
geometric  mean  index-number  in  year  2  to  that  in  year  1  is 
identical  with  the  geometric  mean  of  the  ratios  for  the  index- 
numbers  of  the  several  commodities.  A  similar  property  does 
not  hold  for  any  other  form  of  average  :  the  ratio  of  the  arithmetic 
mean  index-numbers  is  not  the  same  as  the  arithmetic  mean  of 
the  ratios,  nor  is  the  ratio  of  the  medians  the  median  of  the 
ratios.  From  the  second  and  third  forms  of  the  equation  it 
appears  further  that  the  ratio  of  the  geometric  mean  index- 
number  in  year  2  to  that  in  year  1  is  independent  of  the  prices  in 
the  year  first  chosen  as  base  (i.e.  year  0),  and  is  identical  with  the 
geometric mean of the index-numbers for year 2, on year 1 as
base.  Again,  a  similar  property  does  not  hold  for  any  other  form 
of  average.  If  arithmetic  means  of  the  index-numbers  be  taken, 
for  example,  the  ratio  of  the  mean  in  year  2  to  the  mean  in  year 
1  will  vary  with  the  year  taken  as  base,  and  will  differ  more  or 
less  from  the  arithmetic  mean  ratio  of  the  prices  in  year  2  to  the 
prices  of  the  same  commodities  in  year  1  ;  the  same  statement  is 
true  if  medians  be  used.  The  results  given  by  the  use  of  the 
geometric  mean  possess,  therefore,  a  certain  consistency  that  is 
not  exhibited  if  other  forms  of  average  are  employed.  It  was 
used  in  a  classical  paper  by  Jevons  (ref.  4),  though  not  on  quite 
the  same  grounds,  but  has  never  been  at  all  generally  employed. 
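
The consistency claimed for the geometric mean, and its absence in
the case of the arithmetic mean, may be illustrated on a small
example. The prices below are hypothetical, and the function and
variable names are assumptions introduced only for this sketch.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def arithmetic_mean(values):
    """Ordinary arithmetic mean."""
    return sum(values) / len(values)


# Hypothetical prices of three commodities in years 0, 1 and 2.
prices = {
    0: [10.0, 20.0, 5.0],
    1: [12.0, 18.0, 10.0],
    2: [15.0, 27.0, 8.0],
}


def price_ratios(year, base):
    """Ratios of prices in `year` to prices in `base` (index-numbers / 100)."""
    return [x / x0 for x, x0 in zip(prices[year], prices[base])]


y10, y20, y21 = price_ratios(1, 0), price_ratios(2, 0), price_ratios(2, 1)

# Geometric mean: ratio of the indices on year 0 equals the index on year 1.
geo_ratio_of_means = geometric_mean(y20) / geometric_mean(y10)
geo_mean_of_ratios = geometric_mean(y21)

# Arithmetic mean: the corresponding quantities do not agree.
arith_ratio_of_means = arithmetic_mean(y20) / arithmetic_mean(y10)
arith_mean_of_ratios = arithmetic_mean(y21)
```

For the geometric mean the two results coincide whatever year is
taken as base; for the arithmetic mean, with these figures, they
differ by nearly ten per cent.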

26.  The  general  use  of  the  geometric  mean  has  been  suggested 
on  another  ground,  namely,  that  the  magnitudes  of  deviations 
appear, as a rule, to be dependent in some degree on the magnitude
of the average; thus the length of a mouse varies less than
the  stature  of  a  man,  and  the  height  of  a  shrub  less  than  that  of 
a  tree.  Hence,  it  is  argued,  variations  in  such  cases  should  be 
measured  rather  by  their  ratio  to,  than  their  difference  from,  the 
average ;  and  if  this  is  done,  the  geometric  mean  is  the  natural 
average to use.  If deviations be measured in this way, a
deviation $G/r$ will be regarded as the equivalent of a deviation $r \cdot G$,
instead of a deviation $-x$ as the equivalent of a deviation $+x$.
If a distribution take the simplest possible form when relative
deviations are regarded as equivalents, the frequency of deviations
between $G/s$ and $G/r$ will be equal to the frequency of deviations
between $r \cdot G$ and $s \cdot G$.  The frequency-curve will then be
symmetrical round log G if plotted to log X as base, and if there be
a single mode, log G will be that mode, a logarithmic or geometric
mode, as it might be termed: G will not be the mode if the
distribution be plotted in the ordinary way to values of X as base.
The theory of such a distribution has been discussed by more than
one author (refs. 2, 8, 9).  The general applicability of the
assumption made does not, however, appear to have been very widely
tested, and the reasons assigned have not sufficed to bring the
geometric mean into common use.  It may be noted that, as the
geometric  mean  is  always  less  than  the  arithmetic  mean,  the 
fundamental  assumption  which  would  justify  the  use  of  the  former 
clearly  does  not  hold  where  the  (arithmetic)  mode  is  greater  than 
the  arithmetic  mean,  as  in  Tables  X.  and  XI.  of  the  last  chapter. 

27. The Harmonic Mean. The harmonic mean of a series of
quantities is the reciprocal of the arithmetic mean of their
reciprocals, that is, if H be the harmonic mean,

$$\frac{1}{H} = \frac{1}{N}\,\Sigma\!\left(\frac{1}{X}\right) \qquad (18)$$

The following illustration, the result of which is required for an
example in a later chapter (Chap. XIII. § 11), will serve to show
the method of calculation.

The table gives the number of litters of mice, in certain
breeding experiments, with given numbers (X) in the litter.  (Data
from A. D. Darbishire, Biometrika, iii. pp. 30, 31.)

Number in Litter.    Number of Litters.    f/X.
       X.                    f.

       1                     7             7.000
       2                    11             5.500
       3                    16             5.333
       4                    17             4.250
       5                    26             5.200
       6                    31             5.167
       7                    11             1.571
       8                     1             0.125
       9                     1             0.111

     Total                 121            34.257


Whence 1/H = 0.2831, H = 3.532.  The arithmetic mean is 4.587,
or more than a unit greater.
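
The arithmetic of the table may be reproduced as follows. The
frequencies are those of the table; the variable names are
illustrative only.

```python
# Number in litter X -> number of litters f, as in the table above.
litters = {1: 7, 2: 11, 3: 16, 4: 17, 5: 26, 6: 31, 7: 11, 8: 1, 9: 1}

n_total = sum(litters.values())                         # N, the number of litters
sum_f_over_x = sum(f / x for x, f in litters.items())   # sum of the column f/X
harmonic_mean = n_total / sum_f_over_x                  # H, from 1/H = (1/N) * sum(f/X)
arithmetic_mean = sum(f * x for x, f in litters.items()) / n_total

print(f"1/H = {1 / harmonic_mean:.4f}, H = {harmonic_mean:.3f}, "
      f"arithmetic mean = {arithmetic_mean:.3f}")
```

Carried to full precision the sum of the column f/X is about 34.2575;
the printed total is the sum of the entries as rounded to three
places, and the difference does not affect H to the figures given.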

If  the  prices  of  a  commodity  at  different  places  or  times  are 
stated  in  the  form  "  so  much  for  a  unit  of  money,"  and  an  average 
price  obtained  by  taking  the  arithmetic  mean  of  the  quantities 
sold  for  a  unit  of  money,  the  result  is  equivalent  to  the  harmonic 
mean  of  prices  stated  in  the  ordinary  way.  Thus  retail  prices  of 
eggs  were  quoted  before  the  War  as  "so  many  to  the  shilling." 
Supposing  we  had  100  returns  of  retail  prices  of  eggs,  50  returns 
showing  twelve  eggs  to  the  shilling,  30  fourteen  to  the  shilling, 
and 20 ten to the shilling; then the mean number per shilling
would be 12.2, equivalent to a price of 0.984d. per egg.  But
if the prices had been quoted in the form usual for other
commodities, we should have had 50 returns showing a price of 1d.
per egg, 30 showing a price of 0.857d., and 20 a price of 1.2d.:
arithmetic mean 0.997d., a slightly greater value than the
harmonic mean of 0.984d.  The official returns of prices in India were,
until 1907, given in the form of "Sers (2.057 lbs.) per rupee."
The average annual price of a commodity was based on half-
monthly prices stated in this form, and "index-numbers" were
calculated from such annual averages.  In the issues of "Prices
and Wages in India" for 1908 and later years the prices have
been stated in terms of "rupees per maund (82.286 lbs.)."  The
change,  it  will  be  seen,  amounts  to  a  replacement  of  the  harmonic 
by  the  arithmetic  mean  price. 
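
The example of the eggs may be worked through as follows. The returns
are those supposed in the text; the variable names are assumptions
made for the purpose of the sketch.

```python
PENCE_PER_SHILLING = 12

# Eggs to the shilling -> number of returns, as supposed in the text.
returns = {12: 50, 14: 30, 10: 20}
n_returns = sum(returns.values())

# Arithmetic mean of the quantities sold per shilling, turned into pence per egg.
mean_eggs_per_shilling = sum(eggs * n for eggs, n in returns.items()) / n_returns
price_from_mean_quantity = PENCE_PER_SHILLING / mean_eggs_per_shilling

# The same returns restated as prices in pence per egg.
prices = {PENCE_PER_SHILLING / eggs: n for eggs, n in returns.items()}
arithmetic_mean_price = sum(p * n for p, n in prices.items()) / n_returns
harmonic_mean_price = n_returns / sum(n / p for p, n in prices.items())

print(f"{mean_eggs_per_shilling:.1f} eggs per shilling; "
      f"harmonic mean {harmonic_mean_price:.3f}d., "
      f"arithmetic mean {arithmetic_mean_price:.3f}d.")
```

The price derived from the mean number per shilling is identical with
the harmonic mean of the prices per egg, and is slightly less than
their arithmetic mean.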

The harmonic mean of a series of quantities is always lower
than the geometric mean of the same quantities, and, a fortiori,
lower than the arithmetic mean, the amount of difference depending
largely on the magnitude of the dispersion relatively to the
magnitude of the mean.  (Cf. Question 9, Chap. VIII.)
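
The order of the three means, and the dependence of their differences
on the dispersion, may be seen from two hypothetical series having the
same arithmetic mean. The series and the function names below are
illustrative assumptions, not data from the text.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    return len(values) / sum(1.0 / v for v in values)


# Two hypothetical series with the same arithmetic mean (10)
# but very different dispersion.
narrow = [9.0, 10.0, 11.0]
wide = [2.0, 10.0, 18.0]


def gaps(values):
    """Return (A - G, G - H) for a series of positive values."""
    a = arithmetic_mean(values)
    g = geometric_mean(values)
    h = harmonic_mean(values)
    return a - g, g - h


for name, series in (("narrow", narrow), ("wide", wide)):
    print(name, [round(gap, 3) for gap in gaps(series)])
```

In both series the harmonic mean is the least and the arithmetic mean
the greatest of the three, but the differences are far larger for the
more widely dispersed series.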

REFERENCES. 
General. 

(1) Fechner, G. T., "Ueber den Ausgangswerth der kleinsten
    Abweichungssumme, dessen Bestimmung, Verwendung und
    Verallgemeinerung," Abh. d. kgl. sächsischen Gesellschaft d.
    Wissenschaften, vol. xviii. (also numbered xi. of the Abh. d.
    math.-phys. Classe); Leipzig (1878), p. 1.  (The average defined as
    the origin from which the dispersion, measured in one way or another,
    is a minimum: geometric mean dealt with incidentally, pp. 13-16.)

(2) Fechner, G. T., Kollektivmasslehre, herausgegeben von G. F. Lipps;
    Engelmann, Leipzig, 1897.  (Posthumously published: deals with
    frequency-distributions, their forms, averages, and measures of
    dispersion in general: includes much of the matter of (1).)

(3) Zizek, Franz, Die statistischen Mittelwerthe; Duncker und Humblot,
    Leipzig, 1908: English translation, Statistical Averages, translated
    with additional notes, etc., by W. M. Persons, Holt & Co., New York,
    1913.  (Non-mathematical, but useful to the economic student for
    references cited.)


The  Geometric  Mean. 

(4) Jevons, W. Stanley, A Serious Fall in the Value of Gold ascertained
    and its Social Effects set forth; Stanford, London, 1863.  Reprinted
    in Investigations in Currency and Finance; Macmillan, London, 1884.
    (The geometric mean applied to the measurement of price changes.)

(5) Jevons, W. Stanley, "On the Variation of Prices and the Value of
    the Currency since 1782," Jour. Roy. Stat. Soc., vol. xxviii., 1865.
    Also reprinted in volume cited above.

(6) Edgeworth, F. Y., "On the Method of ascertaining a Change in the
    Value of Gold," Jour. Roy. Stat. Soc., vol. xlvi., 1883, p. 714.
    (Some criticism of the reasons assigned by Jevons for the use of the
    geometric mean.)

(7) Galton, Francis, "The Geometric Mean in Vital and Social Statistics,"
    Proc. Roy. Soc., vol. xxix., 1879, p. 365.

(8) McAlister, Donald, "The Law of the Geometric Mean," ibid., p. 367.
    (The law of frequency to which the use of the geometric mean would
    be appropriate.)

(9) Kapteyn, J. C., Skew Frequency-curves in Biology and Statistics;
    Noordhoff, Groningen, and Wm. Dawson, London, 1903.  (Contains,
    amongst other forms, a generalisation of McAlister's law.)

(10) Crawford, G. E., "An Elementary Proof that the Arithmetic Mean
    of any number of Positive Quantities is greater than the Geometric
    Mean," Proc. Edin. Math. Soc., vol. xviii., 1899-1900.
See  also  refs.  1  and  2. 

The  Mode. 

(11) Pearson, Karl, "Skew Variation in Homogeneous Material," Phil.
    Trans. Roy. Soc., Series A, vol. clxxxvi., 1895, p. 343.  (Definition
    of mode, p. 345.)

(12)  Yule,  G.  U.,  "Notes  on  the  History  of  Pauperism  in  England  and 

Wales,  etc.  :  Supplementary  Note  on  the  Determination  of  the  Mode," 
Jour.  Roy.  Stat.  Soc,  vol.  lix.,  1896,  p.  343.  (The  note  deals  with 
elementary  methods  of  approximately  determining  the  mode  :  the  one- 
third  rule  and  one  other. ) 

(13)  Pearson,  Karl,  "On  the  Modal  Value  of  an  Organ  or  Character," 

Biometrika,  vol.  i.,  1902,  p.  260.  (A  warning  as  to  the  inadequacy  of 
mere  inspection  for  determining  the  mode. ) 

Estimates  of  Population. 

(14) Waters, A. C., "A Method for estimating Mean Populations in the
    last Intercensal Period," Jour. Roy. Stat. Soc., vol. lxiv., 1901,
    p. 293.

(15) Waters, A. C., Estimates of Population: Supplement to Annual Report
    of the Registrar-General for England and Wales (Cd. 2618, 1907,
    p. cxvii.)

For the methods actually used, see the Reports of the Registrar-General
of England and Wales for 1907, pp. cxxxii-cxxxiv, and for 1910,
pp. xi-xii.  Cf. Snow, ref. 11, Chap. XII., for a different method
based on the symptoms of growth such as numbers of births or of houses.

Index-numbers. 

These  were  incidentally  referred  to  in  §  25.    The  general  theory  of 
index-numbers  and  the  different  methods  in  which  they  may  be  formed 
are  not  considered  in  the  present  work.    The  student  will  find  copious 
references  to  the  literature  in  the  following  : — 
(16) Edgeworth, F. Y., "Reports of the Committee appointed for the
    purpose of investigating the best methods of ascertaining and measuring
    Variations in the Value of the Monetary Standard," British Association

EXERCISES.

1.  Verify  the  following  means  and  medians  from  the  data  of  Table  VI., 
Chap.  VI.,  p.  88. 

                 Stature in Inches for Adult Males in
             England.    Scotland.    Wales.    Ireland.
Mean          67.31       68.55       66.62     67.78
Median        67.35       68.48       66.56     67.69

In the calculation of the means, use the same arbitrary origin as in Example
ii., and check your work by the method of § 13 (b).

2.  Find  the  mean  weight  of  adult  males  in  the  United  Kingdom  from  the 
data in the last column of Table IX., Chap. VI., p. 95.  Also find the median
weight,  and  hence  the  approximate  mode,  by  the  method  of  §  20. 

3.  Similarly,  find  the  mean,  median,  and  approximate  value  of  the  mode 
for  the  distribution  of  fecundity  in  race-horses.  Table  X.,  Chap.  VI.,  p.  96. 

4.  Using  a  graphical  method,  find  the  median  annual  value  of  houses 
assessed  to  inhabited  house  duty  in  the  financial  year  1885-6  from  the  data 
of  Table  IV.,  Chap.  VI.,  p.  83. 

5. (Data from Sauerbeck, Jour. Roy. Stat. Soc., March 1909.)  The figures
in columns 1 and 2 of the small table below show the index-numbers (or
percentages) of prices of certain animal foods in the years 1898 and 1908, on
their  average  prices  during  the  years  1867-77.  In  column  3  have  been  added 
the  ratios  of  the  index-numbers  in  1908  to  the  index-numbers  in  1898,  the 
latter  being  taken  as  100. 

Find  the  average  ratio  of  prices  in  1908  to  prices  in  1898,  taken  as  100  :  — 

(1)  From  the  arithmetic  mean  of  the  ratios  in  col.  3. 

(2)  From  the  ratio  of  the  arithmetic  means  of  cols.  1  and  2. 

(3)  From  the  ratio  of  the  geometric  means  of  cols.  1  and  2. 

(4)  From  the  geometric  mean  of  the  ratios  in  col.  3. 

Note  that,  by  §  25,  the  last  two  methods  must  give  the  same  result. 


                          Index-number of price in    Ratio
    Commodity.              1898.        1908.        08/98.
                             1.           2.            3.

 1. Beef, prime              78           88          112.8
 2. Beef, middling           72           90          125.0
 3. Mutton, prime            84           92          109.5
 4. Mutton, middling         67           95          141.8
 5. Pork                     87           83           95.4
 6. Bacon                    78           84          107.7
 7. Butter                   76           91          119.7

6. (Data from census of 1901.)  The table below shows the population of
the  rural  sanitary  districts  of  Essex,  the  urban  sanitary  districts  (other  than 
the  borough  of  West  Ham),  and  the  borough  of  West  Ham,  at  the  censuses 
of  1891  and  1901.  Estimate  the  total  population  of  the  county  at  a  date 
midway  between  the  two  censuses,  (1)  on  the  assumption  that  the  percentage 
rate  of  increase  is  constant  for  the  county  as  a  whole,  (2)  on  the  assumption 
that  the  percentage  rate  of  increase  is  constant  in  each  group  of  districts  and 
the  borough  of  West  Ham. 


Essex.                         Population.
                            1891.        1901.

Rural districts           232,867      240,776
West Ham                  204,903      267,358
Other urban districts     345,604      575,864

Total                     783,374    1,083,998

7.  (Data  from  Agricultural  Statistics  for  1905,  Cd.  3061,  1906.)  The 
following  statement  shows  the  monthly  average  prices  of  eggs  in  Great 
Britain  in  1905,  as  compiled  from  the  weekly  returns  of  market  prices  for 
first  and  second  quality  British  eggs,  per  120  : — 


Month.            First Quality.    Second Quality.
                     s.  d.             s.  d.

January              13  0              11  0
February             11  0               9  0
March                 8  0               6  0
April                 7  6               6  6
May                   8  0               7  6
June                  8  6               8  0
July                  9  6               8  6
August               11  0              10  0
September            11  6              10  6
October              14  0              12  6
November             18  0              16  0
December             17  6              15  0

Mean for year        11  5½             10  0½

What would have been the mean price for the year in each case if the
wholesale prices had been recorded in the same way as retail prices, i.e. at
so many eggs per shilling?  State your answer in the form of the equivalent
price per 120, and obtain it in the shortest way by taking the harmonic mean
of the above prices (cf. § 27).

8.  Supposing  the  frequencies  of  values  0,  1,  2,  ...  of  a  variable  to  be 
given  by  the  terms  of  the  binomial  series 

$$q^n,\quad n\,q^{n-1}p,\quad \frac{n(n-1)}{1 \cdot 2}\,q^{n-2}p^2,\quad \ldots$$

where p + q = 1, find the mean.

MEASURES OF DISPERSION, ETC.

1.  Inadequacy  of  the  range  as  a  measure  of  dispersion  — 2-13.  The  standard 
deviation:  its  definition,  calculation,  and  properties— 14-19.  The 
mean  deviation  :  its  definition,  calculation,  and  properties— 20-24.  The 
quartile  deviation  or  semi-interquartile  range — 25.  Measures  of 
relative  dispersion— 26.  Measures  of  asymmetry  or  skewness — 27-30. 
The  method  of  grades  or  percentiles. 

1.  The  simplest  possible  measure  of  the  dispersion  of  a  series  of 
values  of  a  variable  is  the  actual  range,  i.e.  the  difference  between 
the  greatest  and  least  values  observed.  While  this  is  frequently 
quoted,  it  is  as  a  rule  the  worst  of  all  possible  measures  for  any 
serious  purpose.  There  are  seldom  real  upper  and  lower  limits 
to  the  possible  values  of  the  variable,  very  large  or  very  small 
values  being  only  more  or  less  infrequent :  the  range  is  therefore 
subject  to  meaningless  fluctuations  of  considerable  magnitude 
according as values of greater or less infrequency happen to
have been actually observed.  Note, for instance, the figures of
Table IX., Chap. VI. p. 95, showing the frequency distributions of
weights of adult males in the several parts of the United
Kingdom.  In Wales, one individual was observed with a weight of
over 280 lbs., the next heaviest being under 260 lbs.  The
addition of the one very exceptional individual has increased the
range by some 30 lbs., or about one-fifth.  A measure subject to
erratic  alterations  by  casual  influences  in  this  way  is  clearly  not 
of  much  use  for  comparative  purposes.  Moreover,  the  measure 
takes  no  account  of  the  form  of  the  distribution  within  the  limits 
of  the  range  ;  it  might  well  happen  that,  of  two  distributions 
covering  precisely  the  same  range  of  variation,  the  one  showed 
the  observations  for  the  most  part  closely  clustered  round  the 
average,  while  the  other  exhibited  an  almost  even  distribution  of 
frequency  over  the  whole  range.  Clearly  we  should  not  regard 
two  such  distributions  as  exhibiting  the  same  dispersion,  though 
they  exhibit  the  same  range.  Some  sort  of  measure  of  dispersion 
is therefore required, based, like the averages discussed in the last
chapter, on all the observations made, so that no single observation
can have an unduly preponderant effect on its magnitude; indeed,
the measure should possess all the properties laid down as
desirable for an average in § 4 of Chap. VII.  There are three such
measures  in  common  use — the  standard  deviation,  the  mean 
deviation,  and  the  quartile  deviation  or  semi-interquartile  range, 
of  which  the  first  is  the  most  important. 

2. The Standard Deviation. The standard deviation is the
square root of the arithmetic mean of the squares of all deviations,
deviations being measured from the arithmetic mean of the
observations.  If the standard deviation be denoted by $\sigma$, and a
deviation from the arithmetic mean by x, as in the last chapter,
then the standard deviation is given by the equation

$$\sigma^2 = \frac{1}{N}\,\Sigma(x^2) \qquad (1)$$

To  square  all  the  deviations  may  seem  at  first  sight  an  artificial 
procedure,  but  it  must  be  remembered  that  it  would  be  useless  to 
take  the  mere  sum  of  the  deviations,  in  order  to  obtain  a  measure 
of  dispersion,  since  this  sum  is  necessarily  zero  if  deviations  be 
taken  from  the  mean.  In  order  to  obtain  some  quantity  that 
shall  vary  with  the  dispersion  it  is  necessary  to  average  the 
deviations  by  a  process  that  treats  them  as  if  they  were  all  of  the 
same  sign,  and  squaring  is  the  simplest  process  for  eliminating 
signs  which  leads  to  results  of  algebraical  convenience. 
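
The definition of equation (1) may be sketched in a few lines. The
observations are hypothetical, and the function name is an assumption
made for illustration; the divisor is N, as in equation (1), and not
N - 1 as in some other conventions.

```python
import math


def standard_deviation(values):
    """Square root of the mean squared deviation from the arithmetic mean
    (divisor N, as in equation (1))."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n)


observations = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical values
mean = sum(observations) / len(observations)
deviations = [v - mean for v in observations]

# The plain sum of the deviations from the mean is zero, and so is
# useless as a measure of dispersion; the squares do not cancel.
sum_of_deviations = sum(deviations)
sigma = standard_deviation(observations)
```

For these values the mean is 5, the squared deviations sum to 32, and
the standard deviation is accordingly 2.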

3.  A  quantity  analogous  to  the  standard  deviation  may  be 
defined  in  more  general  terms.  Let  A  be  any  arbitrary  value  of 
X, and let $\xi$ (as in Chap. VII. § 8) denote the deviation of X
from A; i.e. let

$$\xi = X - A.$$

Then we may define the root-mean-square deviation s from the
origin A by the equation

$$s^2 = \frac{1}{N}\,\Sigma(\xi^2) \qquad (2)$$

In terms of this definition the standard deviation is the root-
mean-square deviation from the mean.  There is a very simple
relation between the standard deviation and the root-mean-square
deviation from any other origin.  Let

$$M - A = d \qquad (3)$$

so that $\xi = x + d$.  Then

$$\xi^2 = x^2 + 2x \cdot d + d^2,$$

and, summing over all N observations,

$$\Sigma(\xi^2) = \Sigma(x^2) + 2d \cdot \Sigma(x) + N \cdot d^2.$$

But the sum of the deviations from the mean is zero, therefore
the second term vanishes, and accordingly

$$s^2 = \sigma^2 + d^2 \qquad (4)$$

Hence  the  root-mean-square  deviation  is  least  when  deviations 
are  measured  from  the  mean,  i.e.  the  standard  deviation  is  the  least 
possible  root-mean-square  deviation. 
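
Equation (4) is easily verified by arithmetic. The following is a minimal sketch in Python; the observations and the arbitrary origins are hypothetical, chosen only for illustration, and the function name `rms_deviation` is our own.

```python
import math

def rms_deviation(values, origin):
    """Root-mean-square deviation s of the values from the given origin."""
    n = len(values)
    return math.sqrt(sum((v - origin) ** 2 for v in values) / n)

# Hypothetical observations, for illustration only.
values = [3.0, 5.0, 6.0, 8.0, 13.0]
mean = sum(values) / len(values)          # the arithmetic mean M
sigma = rms_deviation(values, mean)       # the standard deviation

for origin in (0.0, 5.0, 7.0, 10.0):      # arbitrary origins A
    d = mean - origin                     # d = M - A, equation (3)
    s = rms_deviation(values, origin)
    # The two printed figures should agree: s^2 and sigma^2 + d^2.
    print(origin, round(s * s, 6), round(sigma ** 2 + d * d, 6))
```

Whatever origin is tried, s is never less than σ, which is the sense in which the standard deviation is the least possible root-mean-square deviation.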

Σ(ξ²), or Σ(f.ξ²) if we are dealing with a grouped distribution
and f is the frequency of ξ, is sometimes termed the second moment
of the distribution about A, just as Σ(ξ) or Σ(f.ξ) is termed
the first moment (cf. Chap. VII. § 8): we shall not make use
of the term in the present work. Generally, Σ(f.ξⁿ) is termed
the nth moment.

4. If σ and d are the two sides of a right-angled triangle, s is
the hypotenuse. If, then, MB be the vertical through the
mean of a frequency-distribution (fig. 25), and MS be set off
equal to the standard deviation (on the same scale in which the
variable X is plotted along the base), SA will be the root-mean-square
deviation from the point A. This construction gives a
concrete idea of the way in which the root-mean-square deviation
depends on the origin from which deviations are measured. It
will be seen that for small values of d the difference of s from σ
will be very minute, since A will lie very nearly on the circle
drawn through M with centre S and radius SM: slight errors
in the mean due to approximations in calculation will not,
therefore, appreciably affect the value of the standard deviation.

5.  If  we  have  to  deal  with  relatively  few,  say  thirty  or  forty, 
ungrouped  observations,  the  method  of  calculating  the  standard 
deviation  is  perfectly  straightforward.  It  is  illustrated  by  the 
figures given below for the estimated average earnings of
agricultural labourers in 38 rural unions. The values (earnings)
are first of all totalled and the total divided by N to give the
arithmetic mean M, viz. 15s. 11 5/19d., or 15s. 11d. to the nearest
penny. The earnings being estimates, it is not necessary to take
the average to any higher degree of accuracy. Having found
the mean, the difference of each observation from the mean is
next written down as in col. 3, one penny being taken as the
unit: the signs are not entered, as they are not wanted, but the
work should be checked by totalling the positive and negative
differences separately. [The positive total is 300 and the
negative 290, thus checking the value for the mean, viz. 15s.
11d. + 10/38.]

Finally,  each  difference  is  squared,  and  the  squares  entered  in 
col.  4, — tables  of  squares  are  useful  for  such  work  if  any  of  the 
differences  to  be  squared  are  large  (see  list  of  Tables,  p.  356). 
The  sum  of  the  squares  is  16,018.  Treating  the  value  taken  for 
the  mean  as  sensibly  accurate,  we  have — 

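$$\sigma^2 = \frac{16018}{38}, \qquad \sigma = 20.5\text{d.}$$

If we wish to be more precise we can reduce to the true mean
by the use of equation (4), as follows:

$$s^2 = \frac{16018}{38} = 421.5263$$
$$d = \frac{10}{38} = 0.2632; \qquad d^2 = 0.0693$$

Hence

$$\sigma^2 = s^2 - d^2 = 421.4570$$
$$\sigma = 20.529\text{d.}$$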


Evidently this reduction, in the given case, is unnecessary,
illustrating the fact mentioned at the end of § 4, that small
errors in the mean have little effect on the value found for the
standard deviation. The first value is correct within a very
small fraction of a penny.

Calculation of the Standard Deviation: Example i. Calculation of
Mean and Standard Deviation for a Short Series of Observations
ungrouped. Estimated Average Weekly Earnings of Agricultural Labourers
in Thirty-eight Rural Unions, in 1892-3. (W. Little: Labour
Commission; Report, vol. v., part i., 1894.)

  1.                        2.           3.            4.
  Union.                    Earnings     Difference    (Difference)²
                            (s.  d.)     ξ (Pence)
  ------------------------------------------------------------------
   1. Glendale              20   9          58            3,364
   2. Wigton                20   3          52            2,704
   3. Garstang              19   8          45            2,025
   4. Belper                18   6          31              961
   5. Nantwich              17   8          21              441
   6. Atcham                17   6          19              361
   7. Driffield             17   1          14              196
   8. Uttoxeter             17   0          13              169
   9. Wetherby              17   0          13              169
  10. Easingwold            16  11          12              144
  11. Southwell             16   6           7               49
  12. Hollingbourn          16   4           5               25
  13. Melton Mowbray        16   3           4               16
  14. Truro                 16   3           4               16
  15. Godstone              16   0           1                1
  16. Louth                 16   0           1                1
  17. Brixworth             15   9           2                4
  18. Crediton              15   8           3                9
  19. Holbeach              15   6           5               25
  20. Maldon                15   6           5               25
  21. Monmouth              15   4           7               49
  22. St. Neots             15   3           8               64
  23. Swaffham              15   0          11              121
  24. Thakeham              15   0          11              121
  25. Thame                 15   0          11              121
  26. Thingoe               15   0          11              121
  27. Basingstoke           15   0          11              121
  28. Cirencester           15   0          11              121
  29. N. Witchford          14  10          13              169
  30. Pewsey                14   9          14              196
  31. Bromyard              14   9          14              196
  32. Wantage               14   9          14              196
  33. Stratford-on-Avon     14   7          16              256
  34. Dorchester            14   6          17              289
  35. Woburn                14   6          17              289
  36. Buntingford           14   4          19              361
  37. Pershore              13   6          29              841
  38. Langport              12   6          41            1,681
  ------------------------------------------------------------------
  Total                    605   8      +300, -290       16,018

The figures dealt with in this illustration are estimates of the
weekly earnings of the agricultural labourers, i.e. they include
allowances for gifts in kind, such as coal, potatoes, cider, etc. The
estimated weekly money wages are, however, also given in the
same Report, and we are thus enabled to make an interesting
comparison of the dispersions of the two. It might be expected
that earnings would vary less than wages, as his earnings and not
the mere money wages he receives are the important matter to
the labourer, and as a fact we find

  Standard deviation of weekly earnings    20.5d.
  Standard deviation of weekly wages       26.0d.

The arithmetic mean wage is 13s. 5d.
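
The arithmetic of Example i. may be checked mechanically. The sketch below, in Python, follows the steps of § 5: differences are taken from the working mean of 15s. 11d., the positive and negative totals are formed separately, and the result is reduced to the true mean by equation (4). The variable names are our own; the figures are those of the table.

```python
import math

# Earnings of Example i., (shillings, pence), in the order of the table.
earnings = [
    (20, 9), (20, 3), (19, 8), (18, 6), (17, 8), (17, 6), (17, 1), (17, 0),
    (17, 0), (16, 11), (16, 6), (16, 4), (16, 3), (16, 3), (16, 0), (16, 0),
    (15, 9), (15, 8), (15, 6), (15, 6), (15, 4), (15, 3), (15, 0), (15, 0),
    (15, 0), (15, 0), (15, 0), (15, 0), (14, 10), (14, 9), (14, 9), (14, 9),
    (14, 7), (14, 6), (14, 6), (14, 4), (13, 6), (12, 6),
]
pence = [12 * s + d for s, d in earnings]   # one penny as the unit
n = len(pence)                              # N = 38

origin = 15 * 12 + 11                       # working mean: 15s. 11d.
diffs = [p - origin for p in pence]
positive = sum(x for x in diffs if x > 0)   # total of positive differences
negative = -sum(x for x in diffs if x < 0)  # total of negative differences
sum_sq = sum(x * x for x in diffs)          # sum of the squares, col. 4

s2 = sum_sq / n                             # mean square about 15s. 11d.
d = (positive - negative) / n               # true mean minus working mean
sigma = math.sqrt(s2 - d * d)               # equation (4)

print(positive, negative, sum_sq)
print(round(math.sqrt(s2), 1), round(sigma, 3))
```

The two printed standard deviations differ only beyond the first decimal place, as remarked in § 5.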

6.  If  we  have  to  deal  with  a  grouped  frequency-distribution, 
the  same  artifices  and  approximations  are  used  as  in  the  calculation 
of  the  mean  (Chap.  VII.  §§  8,  9,  10).  The  mid-value  of  one  of 
the  class-intervals  is  chosen  as  the  arbitrary  origin  A  from  which 
to measure the deviations ξ, the class-interval is treated as a
unit throughout the arithmetic, and all the observations within
any one class-interval are treated as if they were identical with
the mid-value of the interval. If, as before, we denote the
frequency in any one interval by f, these f observations contribute
f.ξ² to the sum of the squares of deviations and we have:

$$s^2 = \frac{1}{N}\Sigma(f\xi^2).$$

The standard deviation is then calculated from equation (4).

7.  The  whole  of  the  work  proceeds  naturally  as  an  extension  of 
that  necessary  for  calculating  the  mean,  and  we  accordingly  use 
the  same  illustrations  as  in  the  last  chapter.  Thus  in  Example 
ii.  below,  cols.  1,  2,  3,  and  4  are  the  same  as  those  we  have  already 
given  in  Example  i.  of  Chap.  VII.  for  the  calculation  of  the  mean. 
Column  5  gives  the  figures  necessary  for  calculating  the  standard 
deviation, and is derived directly from col. 4 by multiplying the
figures of that column again by ξ. Thus 90 × 5 = 450, 192 × 4 =
768, and so on. The work is therefore done very rapidly. The
remaining  steps  of  the  arithmetic  are  given  below  the  table  ;  the 
student  must  be  careful  to  remember  the  final  conversion,  if 
necessary,  from  the  class-interval  as  unit  to  the  natural  unit 
of measurement. In this case the value found is 2.48
class-intervals, and the class-interval being half a unit, that is 1.24
per cent.

Calculation of the Standard Deviation: Example ii. Calculation of
the Standard Deviation of the Percentages of the Population in receipt of
Relief, in addition to the Mean, from the figures of Table VIII. of
Chap. VI. (Cf. the work for the mean alone, p. 111.)

  (1)            (2)          (3)             (4)        (5)
  Percentage     Frequency    Deviation       Product    Product
  in receipt     f            from Value A    fξ         fξ²
  of Relief                   ξ
  ----------------------------------------------------------------
   1               18           - 5             90         450
   1.5             48           - 4            192         768
   2               72           - 3            216         648
   2.5             89           - 2            178         356
   3              100           - 1            100         100
   3.5             90             0           -776           -
   4               75           + 1             75          75
   4.5             60           + 2            120         240
   5               40           + 3            120         360
   5.5             21           + 4             84         336
   6               11           + 5             55         275
   6.5              5           + 6             30         180
   7                1           + 7              7          49
   7.5              1           + 8              8          64
   8                -           + 9              -           -
   8.5              1           +10             10         100
  ----------------------------------------------------------------
  Total           632                         +509        4001

From previous work, p. 111, M - A = d = -0.4225 class-intervals.

$$\frac{\Sigma(f\xi^2)}{N} = \frac{4001}{632} = 6.3307$$
$$\therefore\ \sigma^2 = 6.3307 - (0.4225)^2 = 6.1522.$$

σ = 2.48 intervals = 1.24 per cent.

To  illustrate  again  the  value  of  the  standard  deviation  for 
purposes  of  comparison,  figures  are  given  below  showing  the 
means  and  standard  deviations  of  similar  distributions  for  a  series 
of  years  from  1850.  It  will  be  seen  that  not  only  did  the  mean 
decrease  during  the  period,  but  the  standard  deviation  decreased 
to  an  equally  marked  extent,  having  been  halved  between 
1850  and  1891  ;  the  average  was  lowered,  and  at  the  same  time 
the  percentages  of  the  population  in  receipt  of  relief  clustered 
much  more  closely  round  the  lower  average. 




Means and Standard Deviations of the Distributions of Pauperism (Percentage
of the Population in receipt of Poor-law Relief) in the Unions of England
and Wales since 1850. (From Yule, Jour. Roy. Stat. Soc., vol. lix.,
1896, figures slightly amended.)

  Year.     Percentage of the Population in receipt of Relief.
            Arithmetic Mean.        Standard Deviation.
  --------------------------------------------------------------
  1850          6.51                    2.50
  1860          5.20                    2.07
  1870          5.45                    2.02
  1881          3.68                    1.36
  1891          3.29                    1.24
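
The grouped calculation of Example ii., whose results (mean 3.29, standard deviation 1.24 per cent.) agree with the 1891 line of the table above, can be sketched in Python as follows. The class-interval is treated as the unit throughout and converted only at the end; the variable names are our own.

```python
import math

# Example ii.: mid-values of the classes (per cent.) and frequencies f.
mid_values = [1.0 + 0.5 * i for i in range(16)]        # 1.0, 1.5, ... 8.5
freqs = [18, 48, 72, 89, 100, 90, 75, 60, 40, 21, 11, 5, 1, 1, 0, 1]

interval = 0.5        # class-interval, per cent.
origin = 3.5          # arbitrary origin A (a mid-value)

n = sum(freqs)
# Deviations xi from A in class-intervals: -5, -4, ... +10.
xi = [round((m - origin) / interval) for m in mid_values]
sum_f_xi = sum(f * x for f, x in zip(freqs, xi))        # col. 4
sum_f_xi2 = sum(f * x * x for f, x in zip(freqs, xi))   # col. 5

d = sum_f_xi / n                     # M - A, in class-intervals
s2 = sum_f_xi2 / n                   # mean square about A
sigma = math.sqrt(s2 - d * d)        # equation (4), in class-intervals

mean = origin + d * interval         # back to the natural unit
print(n, sum_f_xi, sum_f_xi2)
print(round(mean, 2), round(sigma, 2), round(sigma * interval, 2))
```

The final multiplication by the class-interval is the conversion that the student is warned in § 7 not to forget.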

8.  In  the  table  given  on  p.  141  (Example  iii.),  the  calculation  of 
the  standard  deviation  is  similarly  shown  for  the  distribution  of 
the  statures  of  adult  males  in  the  British  Isles,  the  work  being 
continued  from  the  stage  which  it  reached  for  the  calculation  of 
the mean in Example ii. of Chap. VII. The steps of the arithmetic
hardly call for further explanation, but it may be noted that
the  class-interval  being  a  unit  in  this  case,  no  conversion  of 
the  standard  deviation  from  class-intervals  to  units  is  required. 

9.  The  student  must  remember,  as  in  the  case  of  the  calculation 
of  the  mean,  that  the  treatment  of  all  values  within  each  class- 
interval  as  if  they  were  identical  with  the  mid-value  of  the  interval 
is  an  approximation  and  no  more  (cf.  Chap.  VII.  §  11),  though, 
for  a  distribution  of  the  symmetrical  or  moderately  asymmetrical 
type  with  a  class-interval  not  greater  than  one-twentieth  or  so 
of  the  range,  the  approximation  may  be  a  very  close  one.  But 
while  the  value  of  the  arithmetic  mean  may  be  either  increased 
or  decreased  by  grouping,  in  the  case  of  distributions  which  are 
not  more  than  slightly  asymmetrical,  the  standard  deviation  of 
such  distributions  tends  to  be  increased,  and  the  increase  is  the 
greater  the  cruder  the  grouping.  We  give  an  approximate 
correction  for  this  effect  later  (Chap.  XI.  §  4).  The  student  is 
recommended  to  test  for  himself  the  effect  of  grouping  in  two 
or  three  cases. 
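
One way of making such a test is sketched below in Python. The series is artificial: evenly spaced quantiles of a normal curve of unit standard deviation are assumed as a stand-in for a symmetrical distribution, and the class-intervals are assumed to begin at whole multiples of their width. The function names are our own.

```python
import math
from statistics import NormalDist

def std_dev(values):
    """Standard deviation about the arithmetic mean (divisor N)."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((v - m) ** 2 for v in values) / n)

def grouped_std_dev(values, width):
    """Standard deviation after replacing every value by the mid-value
    of its class-interval; classes run from k*width to (k+1)*width."""
    mids = [(math.floor(v / width) + 0.5) * width for v in values]
    return std_dev(mids)

# Artificial symmetrical series (an assumption, for illustration only).
n = 2001
series = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]

print("ungrouped", round(std_dev(series), 4))
for width in (0.25, 0.5, 1.0, 2.0):
    print(width, round(grouped_std_dev(series, width), 4))
```

The printed values rise as the class-interval is widened, the cruder grouping giving the larger standard deviation.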

Calculation of the Standard Deviation: Example iii. Calculation
of the Standard Deviation of Stature of Male Adults in the British Isles
from the figures of Table VI., p. 88. (Cf. p. 112 for the calculation of
mean alone.)

  (1)          (2)          (3)             (4)        (5)
  Stature,     Frequency    Deviation       Product    Product
  inches       f            from Value A    fξ         fξ²
                            ξ
  ----------------------------------------------------------------
  57-              2          -10              20         200
  58-              4          - 9              36         324
  59-             14          - 8             112         896
  60-             41          - 7             287       2,009
  61-             83          - 6             498       2,988
  62-            169          - 5             845       4,225
  63-            394          - 4            1576       6,304
  64-            669          - 3            2007       6,021
  65-            990          - 2            1980       3,960
  66-           1223          - 1            1223       1,223
  67-           1329            0           -8584           -
  68-           1230          + 1            1230       1,230
  69-           1063          + 2            2126       4,252
  70-            646          + 3            1938       5,814
  71-            392          + 4            1568       6,272
  72-            202          + 5            1010       5,050
  73-             79          + 6             474       2,844
  74-             32          + 7             224       1,568
  75-             16          + 8             128       1,024
  76-              5          + 9              45         405
  77-              2          +10              20         200
  ----------------------------------------------------------------
  Total         8585                        +8763      56,809

From previous work, M - A = d = +0.0209 class-intervals or inches.

$$\frac{\Sigma(f\xi^2)}{N} = \frac{56809}{8585} = 6.6172$$
$$\sigma^2 = 6.6172 - (0.0209)^2 = 6.6168.$$

σ = 2.57 class-intervals or inches.

10. It is a useful empirical rule to remember that a range of
six times the standard deviation usually includes 99 per cent. or
more of all the observations in the case of distributions of the
symmetrical or moderately asymmetrical type. Thus in Example
ii. the standard deviation is 1.24 per cent.; six times this is 7.44
per cent., and a range from 0.75 to 8.19 per cent. includes all
but one observation out of 632. In Example iii. the standard
deviation is 2.57 in., six times this is 15.42 in., and a range from,
say, 60 in. to 75.4 in. includes all but some 37 out of 8585
individuals, i.e. about 99.6 per cent. This rough rule serves to
give a more definite and concrete meaning to the standard
deviation, and also to check arithmetical work to some extent:
sufficiently, that is to say, to guard against very gross blunders.
It must not be expected to hold for short series of observations:
in Example i., for instance, the actual range is a good deal less
than six times the standard deviation.
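
The count for Example iii. can be reproduced approximately as below. The Python sketch rests on two assumptions that are ours and not the text's: that each class begins at the whole inch by which it is named, and that the observations are spread evenly over each class-interval.

```python
# Example iii.: nominal lower limits of the classes (inches) and frequencies.
lower_limits = list(range(57, 78))                 # 57-, 58-, ... 77-
freqs = [2, 4, 14, 41, 83, 169, 394, 669, 990, 1223, 1329,
         1230, 1063, 646, 392, 202, 79, 32, 16, 5, 2]

def count_within(lo, hi):
    """Frequency between lo and hi, the observations in each unit
    class-interval being assumed evenly spread over it."""
    total = 0.0
    for a, f in zip(lower_limits, freqs):
        overlap = max(0.0, min(hi, a + 1) - max(lo, a))
        total += f * overlap
    return total

n = sum(freqs)
inside = count_within(60.0, 75.4)                  # a range of 15.4 in.
print(n, round(n - inside, 1), round(100 * inside / n, 1))
```

On these assumptions the number of individuals excluded comes out between 36 and 37, in agreement with the "some 37" of the text.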

11.  The  standard  deviation  is  the  measure  of  dispersion  which 
it  is  most  easy  to  treat  by  algebraical  methods,  resembling  in  this 
respect  the  arithmetic  mean  amongst  measures  of  position.  The 
majority  of  illustrations  of  its  treatment  must  be  postponed  to  a 
later stage (Chap. XI.), but the work of § 3 has already served as
one example, and we may take another by continuing the work of
§ 13 (b), Chap. VII. In that section it was shown that if a series
of observations of which the mean is M consist of two component
series, of which the means are M₁ and M₂ respectively,

$$NM = N_1 M_1 + N_2 M_2,$$

N₁ and N₂ being the numbers of observations in the two component
series, and N = N₁ + N₂ the number in the entire series.
Similarly, the standard deviation σ of the whole series may be
expressed in terms of the standard deviations σ₁ and σ₂ of the
components and their respective means. Let

$$M_1 - M = d_1, \qquad M_2 - M = d_2.$$

Then the mean-square deviations of the component series about
the mean M are, by equation (4), σ₁² + d₁² and σ₂² + d₂²
respectively. Therefore, for the whole series,

$$N\sigma^2 = N_1(\sigma_1^2 + d_1^2) + N_2(\sigma_2^2 + d_2^2) \qquad (5)$$

If the numbers of observations in the component series be equal
and the means be coincident, we have as a special case:

$$\sigma^2 = \tfrac{1}{2}(\sigma_1^2 + \sigma_2^2) \qquad (6)$$

so that in this case the square of the standard deviation of the
whole series is the arithmetic mean of the squares of the standard
deviations of its components.

It is evident that the form of the relation (5) is quite general:
if a series of observations consists of r component series with
standard deviations σ₁, σ₂, ... σ_r, and means diverging from the
general mean of the whole series by d₁, d₂, ... d_r, the standard
deviation σ of the whole series is given (using m to denote any
subscript) by the equation:

$$N\sigma^2 = \Sigma(N_m\sigma_m^2) + \Sigma(N_m d_m^2) \qquad (7)$$

Again, as in § 13 of Chap. VII., it is convenient to note, for the
checking of arithmetic, that if the same arbitrary origin be used
for the calculation of the standard deviations in a number of
component distributions we must have

$$\Sigma(f\xi^2) = \Sigma(f_1\xi^2) + \Sigma(f_2\xi^2) + \ldots + \Sigma(f_r\xi^2) \qquad (8)$$
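
Relation (7) may be illustrated by a short calculation. In the Python sketch below the three component series are hypothetical, chosen only for illustration; the standard deviation of the whole is found once directly and once from the standard deviations and means of the components.

```python
import math

def mean_and_sd(values):
    """Arithmetic mean and standard deviation (divisor N) of a series."""
    n = len(values)
    m = sum(values) / n
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / n)

# Hypothetical component series, for illustration only.
components = [
    [2.0, 4.0, 4.0, 6.0],
    [10.0, 12.0, 17.0],
    [5.0, 5.0, 8.0, 9.0, 13.0],
]

whole = [v for series in components for v in series]
n_total = len(whole)
m_total, sd_direct = mean_and_sd(whole)

# Right-hand side of equation (7): sum of N_m * (sigma_m^2 + d_m^2).
rhs = 0.0
for series in components:
    m, sd = mean_and_sd(series)
    d = m - m_total                  # divergence from the general mean
    rhs += len(series) * (sd ** 2 + d ** 2)

sd_combined = math.sqrt(rhs / n_total)
print(round(sd_direct, 6), round(sd_combined, 6))
```

The two printed values coincide, whatever component series are substituted.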

12. As another useful illustration, let us find the standard
deviation of the first N natural numbers. The mean in this case
is evidently (N + 1)/2. Further, as is shown in any elementary
Algebra, the sum of the squares of the first N natural numbers is

$$\frac{N(N+1)(2N+1)}{6}.$$

The standard deviation σ is therefore given by the equation:

$$\sigma^2 = \tfrac{1}{6}(N+1)(2N+1) - \tfrac{1}{4}(N+1)^2,$$

that is,

$$\sigma^2 = \tfrac{1}{12}(N^2 - 1) \qquad (9)$$

This result is of service if the relative merit of, or the relative
intensity of some character in, the different individuals of a series
is recorded not by means of measurements, e.g. marks awarded on
some system of examination, but merely by means of their
respective positions when ranked in order as regards the character,
in the same way as boys are numbered in a class. With N
individuals there are always N ranks, as they are termed,
whatever the character, and the standard deviation is therefore
always that given by equation (9).

Another useful result follows at once from equation (9), namely,
the standard deviation of a frequency-distribution in which all
values of X within a range ±l/2 on either side of the mean are
equally frequent, values outside these limits not occurring, so that
the frequency-distribution may be represented by a rectangle. The
base l may be supposed divided into a very large number N of equal
elements, and the standard deviation reduces to that of the first N
natural numbers when N is made indefinitely large. The single
unit then becomes negligible compared with N, and consequently

$$\sigma^2 = \frac{l^2}{12} \qquad (10)$$
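
Both results can be checked by direct computation, as in the following Python sketch. The values of N tried and the base l = 3 are arbitrary choices of ours; the width of each element of the rectangle is taken as l/N, so that the standard deviation in natural units is (l/N) times that of the first N natural numbers.

```python
import math

def sd_first_n(n):
    """Standard deviation of 1, 2, ..., n computed directly."""
    mean = (n + 1) / 2
    return math.sqrt(sum((k - mean) ** 2 for k in range(1, n + 1)) / n)

# Equation (9): direct computation against the formula.
for n in (2, 5, 10, 38, 100):
    formula = math.sqrt((n * n - 1) / 12)
    print(n, round(sd_first_n(n), 6), round(formula, 6))

# Equation (10): rectangle of base l divided into n equal elements.
l = 3.0                                            # hypothetical base
for n in (10, 100, 10000):
    approx = (l / n) * math.sqrt((n * n - 1) / 12)
    print(n, round(approx, 6), round(l / math.sqrt(12), 6))
```

As N is increased the second pair of figures draws together, the single unit becoming negligible compared with N.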

13. It will be seen from the preceding paragraphs that the
standard deviation possesses the majority at least of the properties
which are desirable in a measure of dispersion as in an average
(Chap. VII. § 4). It is rigidly defined; it is based on all the
observations made; it is calculated with reasonable ease; it lends
itself readily to algebraical treatment; and we may add, though the
student will have to take the statement on trust for the present,
that it is, as a rule, the measure least affected by fluctuations of
sampling. On the other hand, it may be said that its general
nature  is  not  very  readily  comprehended,  and  that  the  process  of 
squaring  deviations  and  then  taking  the  square  root  of  the  mean 
seems  a  little  involved.  The  student  will,  however,  soon  surmount 
this  feeling  after  a  little  practice  in  the  calculation  and  use  of  the 
constant,  and  will  realise,  as  he  advances  further,  the  advantages 
that  it  possesses.  Such  root-mean-square  quantities,  it  may  be 
added,  frequently  occur  in  other  branches  of  science.  The 
standard deviation should always be used as the measure of dispersion,
unless there is some very definite reason for preferring another
measure,  just  as  the  arithmetic  mean  should  be  used  as  the  measure 
of  position.  It  may  be  added  here  that  the  student  will  meet  with 
the  standard  deviation  under  many  different  names,  of  which  we 
have  adopted  the  most  recent  (due  to  Pearson,  ref.  2) :  many  of 
the  earlier  names  are  hardly  adapted  to  general  use,  as  they  bear 
evidence  of  their  derivation  from  the  theory  of  errors  of  observation. 
Thus  the  terms  "mean  error"  (Gauss),  "error  of  mean  square" 
(Airy),  and  "  mean  square  error  "  have  all  been  used  in  the  same 
sense.  The  standard  deviation  multiplied  by  the  square  root  of 
2  has  been  termed  the  "  modulus  "  (Airy), — the  student  will  see 
later  the  reason  for  the  adoption  of  the  factor — and  the  reciprocal 
of  the  modulus  the  "precision"  (Lexis).  For  the  square  of  the 
standard  deviation,  often  required,  R.  A.  Fisher  has  suggested 
the  term  "  variance." 

14.  The  Mean  Deviation. — The  mean  deviation  of  a  series  of 
values  of  a  variable  is  the  arithmetic  mean  of  their  deviations 
from  some  average,  taken  without  regard  to  their  sign.  The 
deviations  may  be  measured  either  from  the  arithmetic  mean  or 
from the median, but the latter is the natural origin to use. Just
as the root-mean-square deviation is least when deviations are
measured from the arithmetic mean, so the mean deviation is
least when deviations are measured from the median. For
suppose that, for some origin exceeded by m values out of N, the
mean deviation has a value Δ. Let the origin be displaced by
an amount c until it is just exceeded by m - 1 of the values only,
i.e. until it coincides with the mth value from the upper end of
the series. By this displacement of the origin the sum of deviations
in excess of the origin is reduced by m.c, while the sum of
deviations in defect of the origin is increased by (N - m)c. The
new mean deviation is therefore

$$\Delta + \frac{(N - m)c - mc}{N} = \Delta + \frac{1}{N}(N - 2m)c.$$

The new mean deviation is accordingly less than the old so long as

$$m > \tfrac{1}{2}N.$$

That is to say, if N be even, the mean deviation is constant for
all origins within the range between the N/2th and the (N/2 + 1)th
observations, and this value is the least: if N be odd, the mean
deviation is lowest when the origin coincides with the (N + 1)/2th
observation. The mean deviation is therefore a minimum when
deviations are measured from the median or, if the latter be
indeterminate, from an origin within the range in which it lies.

15.  The  calculation  of  the  mean  deviation  either  from  the  mean 
or  from  the  median  for  a  series  of  ungrouped  observations  is  very 
simple.  Take  the  figures  of  Example  i.  (p.  137)  as  an  illustration. 
We have already found the mean (15s. 11d. to the nearest penny),
and the deviations from the mean are written down in column 3.
Adding up this column without respect to the sign of the deviations
we find a total of 590. The mean deviation from the mean
is therefore 590/38 = 15.53d. The mean deviation from the
median  is  calculated  in  precisely  the  same  way,  but  the  median 
replaces  the  mean  as  the  origin  from  which  deviations  are  measured. 
The  median  is  15s.  6d.  The  deviations  in  pence  run  63,  57,  50, 
36,  and  so  on ;  their  sum  is  570 ;  and,  accordingly,  the  mean 
deviation  from  the  median  is  15d.  exactly. 
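
These figures, and the property proved in § 14, can be checked together. The Python sketch below takes the earnings of Example i. reduced to pence, forms the sum of deviations without regard to sign from the working mean and from the median, and then searches every whole-penny origin for the one giving the least sum. The function name is our own.

```python
from statistics import median

# Earnings of Example i. expressed in pence (12 pence to the shilling).
pence = [249, 243, 236, 222, 212, 210, 205, 204, 204, 203, 198, 196, 195,
         195, 192, 192, 189, 188, 186, 186, 184, 183, 180, 180, 180, 180,
         180, 180, 178, 177, 177, 177, 175, 174, 174, 172, 162, 150]

def sum_abs_dev(values, origin):
    """Sum of deviations from the origin, taken without regard to sign."""
    return sum(abs(v - origin) for v in values)

n = len(pence)
working_mean = 191                    # 15s. 11d.
med = median(pence)                   # 15s. 6d.

from_mean = sum_abs_dev(pence, working_mean)
from_median = sum_abs_dev(pence, med)
print(from_mean, round(from_mean / n, 2))
print(med, from_median, from_median / n)

# Origin, in whole pence, for which the sum of deviations is least.
best = min(range(min(pence), max(pence) + 1),
           key=lambda a: sum_abs_dev(pence, a))
print(best)
```

The origin found by the search is the median itself.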

16.  In  the  case  of  a  grouped  frequency-distribution,  the  sum 
of  deviations  should  be  calculated  first  from  the  centre  of  the 
class-interval  in  which  the  mean  (or  median)  lies,  and  then 
reduced  to  the  mean  as  origin.  Thus  in  the  case  of  Example  ii. 
the mean is 3.29 per cent. and lies in the class-interval centring
round 3.5 per cent. We have already found that the sum of
deviations in defect of 3.5 per cent. is 776, and of deviations in
excess 509: total (without regard to sign) 1285, the unit of
measurement being, of course, as it is necessary to remember, the
class-interval. If the number of observations below the mean is
N₁ and above the mean N₂, and M - A = d, as before, we have to
add N₁d to the sum found and subtract N₂d. In the present
case N₁ = 327 and N₂ = 305, while d = -0.42 class-intervals,
therefore

$$d(N_1 - N_2) = -0.42 \times 22 = -9.2,$$

and the sum of deviations from the mean is 1285 - 9.2 = 1275.8.
Hence the mean deviation from the mean is 1275.8/632 = 2.019
class-intervals, or 1.01 per cent.

17. The mean deviation from the median should be found in
precisely similar fashion, but the mid-value of the interval in
which the median (instead of the mean) lies should, for convenience,
be taken as origin. Thus in Example ii. the median is
(Chap. VII. § 15) 3.195 per cent. Hence 3.0 per cent. should be
taken as the origin, d = +0.39 intervals, N₁ = 327, N₂ = 305. The
deviation-sum with 3.0 as origin is found to be 1263, and the
correction is +0.39 × 22 = +8.6. Hence the mean deviation
from the median is 2.012 intervals, or again 1.01 per cent. The
value is really smaller than that of the mean deviation from the
arithmetic mean, but the difference is too slight to affect the
second place of decimals.
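
The procedure of §§ 16 and 17 can be sketched in Python as follows, using the frequencies of Example ii. The function `mean_deviation` and its arguments are our own naming: `shift` marks the mid-value taken as origin (in class-intervals from 3.5 per cent.) and `d` is the distance of the mean or median above that origin.

```python
# Example ii.: deviations xi (class-intervals from 3.5 per cent.) and
# frequencies f.
xi = list(range(-5, 11))
freqs = [18, 48, 72, 89, 100, 90, 75, 60, 40, 21, 11, 5, 1, 1, 0, 1]
interval = 0.5                                  # per cent.
n = sum(freqs)

def mean_deviation(shift, d):
    """Mean deviation, in class-intervals, from an average lying d
    intervals above the mid-value xi = shift: the sum of deviations
    from that mid-value, corrected by d * (N1 - N2)."""
    dev_sum = sum(f * abs(x - shift) for f, x in zip(freqs, xi))
    n1 = sum(f for f, x in zip(freqs, xi) if x - shift < d)   # below
    n2 = n - n1                                               # above
    return (dev_sum + d * (n1 - n2)) / n

md_mean = mean_deviation(0, -0.42)      # mean: origin 3.5, d = -0.42
md_median = mean_deviation(-1, 0.39)    # median: origin 3.0, d = +0.39
print(round(md_mean, 3), round(md_mean * interval, 2))
print(round(md_median, 3), round(md_median * interval, 2))
```

The value from the median is the smaller of the two, though the difference disappears on conversion to per cent. and rounding to two places.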

It  should  be  noted  that,  as  in  the  case  of  the  standard  deviation, 
this  method  of  calculation  implies  the  assumption  that  all  the 
values  of  X  within  any  one  class-interval  may  be  treated  as  if 
they  were  the  mid-value  of  that  interval.  This  is,  of  course,  an 
approximation,  but  as  a  rule  gives  results  of  amply  sufficient 
accuracy  for  practice  if  the  class-interval  be  kept  reasonably  small 
(cf. again Chap. VI. § 5). We have left it as an exercise to the
student  to  find  the  correction  to  be  applied  if  the  values  in  each 
interval  are  treated  as  if  they  were  evenly  distributed  over  the 
interval,  instead  of  concentrated  at  its  centre  (Question  7). 

18.  The  mean  deviation,  it  will  be  seen,  can  be  calculated  rather 
more  rapidly  than  the  standard  deviation,  though  in  the  case  of  a 
grouped  distribution  the  difference  in  ease  of  calculation  is  not 
great.  It  is  not,  on  the  other  hand,  a  convenient  magnitude  for 
algebraical treatment; for example, the mean deviation of a
distribution obtained by combining several others cannot in general
be  expressed  in  terms  of  the  mean  deviations  of  the  component 
distributions,  but  depends  upon  their  forms.  As  a  rule,  it  is  more 
affected  by  fluctuations  of  sampling  than  is  the  standard  deviation, 
but  may  be  less  affected  if  large  and  erratic  deviations  lying 
somewhat  beyond  the  bulk  of  the  distribution  are  liable  to  occur. 
This  may  happen,  for  example,  in  some  forms  of  experimental 
work,  and  in  such  cases  the  use  of  the  mean  deviation  may  be 
slightly  preferable  to  that  of  the  standard  deviation. 

19.  It  is  a  useful  empirical  rule  for  the  student  to  remember 
that for symmetrical or only moderately asymmetrical distributions,
approaching the ideal forms of figs. 5 and 9, the mean
deviation is usually very nearly four-fifths of the standard deviation.
Thus for the distribution of pauperism we have

$$\frac{\text{mean deviation}}{\text{standard deviation}} = \frac{1.01}{1.24} = 0.81.$$

In the case of the distribution of male statures in the British
Isles, Example iii., the ratio found is 0.80. For a short series of
observations like the wage statistics of Example i. a regular result
could hardly be expected: the actual ratio is 15.0/20.5 = 0.73.

We pointed out in § 10 that in distributions of the simple forms
referred to, a range of six times the standard deviation contains
over 99 per cent. of all the observations. If the mean deviation
be employed as the measure of dispersion, we must substitute a
range of 7½ times this measure.
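
The rule of four-fifths may be compared with what a strictly symmetrical curve gives. In the Python sketch below an artificial series of evenly spaced quantiles of the normal curve is assumed as a stand-in for the ideal symmetrical form; this choice is ours, not the text's. For that curve the ratio of mean deviation to standard deviation is known to be the square root of 2/π, or about 0.798.

```python
import math
from statistics import NormalDist

# Artificial symmetrical series (an assumption, for illustration only):
# evenly spaced quantiles of a normal curve of unit standard deviation.
n = 10001
series = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]

mean = sum(series) / n
sd = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
md = sum(abs(v - mean) for v in series) / n

ratio = md / sd
print(round(ratio, 3), round(math.sqrt(2 / math.pi), 3))
# Multiple of the mean deviation equivalent to six standard deviations.
print(round(6 / ratio, 2))
```

The second figure printed is the multiple of the mean deviation that corresponds to a range of six times the standard deviation.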

20. The Quartile Deviation or Semi-interquartile Range. If a
value Q₁ of the variable be determined of such magnitude that
one-quarter of all the values observed are less than Q₁ and
three-quarters greater, then Q₁ is termed the lower quartile. Similarly,
if a value Q₃ be determined such that three-quarters of all the
values observed are less than Q₃ and one-quarter only greater,
then Q₃ is termed the upper quartile. The two quartiles and the
median divide the observed values of the variable into four
classes of equal frequency. If Mi be the value of the median, in
a symmetrical distribution

$$Mi - Q_1 = Q_3 - Mi,$$

and the difference may be taken as a measure of dispersion. But
as no distribution is rigidly symmetrical, it is usual to take as the
measure

$$Q = \frac{Q_3 - Q_1}{2},$$

and Q is termed the quartile deviation, or better, the
semi-interquartile range (it is not a measure of the deviation from
any particular average): the old name probable error should be
confined to the theory of sampling (Chap. XV. § 17).

21.  In  the  case  of  a  short  series  of  ungrouped  observations 
the  quartiles  are  determined,  like  the  median,  by  inspection. 
In  the  wage  statistics  of  Example  i.,  for  instance,  there  are 
38 observations, and 38/4 = 9.5. What is the lower quartile?
The student may be tempted to take it halfway between the
ninth and tenth observations from the bottom of the list;
but this would be wrong, for then there would be nine
observations only below the value chosen instead of 9.5. The
quartile must be taken as given by the tenth observation
itself, which may be regarded as divided by the quartile, and
falling half above it and half below. Therefore

  Lower quartile Q₁ = 14s. 10d.
  Upper quartile Q₃ = 16s. 11d.

and

$$Q = \frac{Q_3 - Q_1}{2} = 12.5\text{d.}$$
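
The determination by inspection can be put into a short routine. The Python sketch below applies the rule just stated: when N/4 is not a whole number the next observation is taken. The treatment of the case in which N/4 is a whole number (the value halfway between that observation and the next) is an assumption of ours, made by analogy with the median, and is not taken from the text.

```python
import math

# Earnings of Example i. expressed in pence (12 pence to the shilling).
pence = [249, 243, 236, 222, 212, 210, 205, 204, 204, 203, 198, 196, 195,
         195, 192, 192, 189, 188, 186, 186, 184, 183, 180, 180, 180, 180,
         180, 180, 178, 177, 177, 177, 175, 174, 174, 172, 162, 150]

def quartile_by_inspection(values, quarter):
    """quarter = 1 for the lower quartile, 3 for the upper."""
    ordered = sorted(values)                 # from the bottom of the list
    pos = len(ordered) * quarter / 4         # e.g. 38/4 = 9.5
    k = math.ceil(pos)
    if pos == int(pos):
        # Assumed rule: halfway between the k-th and (k+1)-th values.
        return (ordered[k - 1] + ordered[k]) / 2
    return ordered[k - 1]                    # the k-th observation itself

q1 = quartile_by_inspection(pence, 1)
q3 = quartile_by_inspection(pence, 3)
q = (q3 - q1) / 2
print(divmod(q1, 12), divmod(q3, 12), q)     # (shillings, pence) and Q
```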


22. In the case of a grouped distribution, the quartiles, like
the median, are determined by simple arithmetical or by
graphical interpolation (cf. Chap. VII. §§ 15, 16). Thus for the
distribution of pauperism, Example ii., we have

  632 ÷ 4                                    = 158
  Total frequency under 2.25 per cent.       = 138
                                               ---
  Difference                                 =  20
  Frequency in interval 2.25 - 2.75          =  89

Whence

$$Q_1 = 2.25 + \frac{20}{89} \times 0.5 = 2.362 \text{ per cent.}$$

Similarly we find

$$Q_3 = 4.130 \text{ per cent.}$$

Hence

$$Q = \frac{4.130 - 2.362}{2} = 0.884 \text{ per cent.}$$

It  is  left  to  the  student  to  check  the  value  by  graphical 
interpolation. 
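
The arithmetical interpolation above can be checked in a few lines. This is a sketch only: the function name `interpolate_in_class` is an assumption made for illustration, the figures are those quoted in the text, and the upper quartile is taken as stated (4.130) because the frequencies needed to recompute it are not given in this section.

```python
def interpolate_in_class(lower_limit, class_width, cum_below, class_freq, target):
    """Linear interpolation within the class-interval containing `target`.

    `cum_below` is the total frequency below `lower_limit`; `class_freq`
    is the frequency within the interval itself.
    """
    return lower_limit + (target - cum_below) / class_freq * class_width


total_frequency = 632
q1 = interpolate_in_class(
    lower_limit=2.25,
    class_width=0.5,
    cum_below=138,
    class_freq=89,
    target=total_frequency / 4,  # 158
)
q3 = 4.130  # value stated in the text, not recomputed here
q = (q3 - q1) / 2
print(round(q1, 3), round(q, 3))
```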

23.  For  distributions  approaching  the  ideal  forms  of  figs. 
5  and  9,  the  semi-interquartile  range  is  usually  about  two-thirds 
of  the  standard  deviation.    Thus  for  Example  ii.  we  find 

[Equation illegible in the source: the ratio $Q/\sigma$ for Example ii.]

The distribution of statures, Example iii., gives the ratio 0.68.
The short series of wage statistics in Example i. could not be
expected to give a result in very strict conformity with the
rule, but the actual ratio, viz. 0.61, does not diverge greatly.
It follows from this ratio that a range of nine times the
semi-interquartile range, approximately, is required to cover the same
proportion of the total frequency (99 per cent. or more) as a range
of  six  times  the  standard  deviation. 
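
The two-thirds rule may be illustrated on one particular ideal form. The sketch below assumes the normal curve as the symmetrical ideal distribution; that is an assumption made here for illustration, as the text speaks only of the forms of figs. 5 and 9. It uses `NormalDist` from the Python standard library.

```python
from statistics import NormalDist

normal = NormalDist(mu=0.0, sigma=1.0)

# Semi-interquartile range in units of the standard deviation:
# by symmetry the upper quartile lies at +Q and the lower at -Q.
q_over_sigma = normal.inv_cdf(0.75)

# Proportion of the total frequency within three standard deviations
# on either side of the mean, i.e. a range of six times sigma.
within_six_sigma_range = normal.cdf(3) - normal.cdf(-3)

# The same range expressed in multiples of Q.
range_in_units_of_q = 6 / q_over_sigma

print(round(q_over_sigma, 4), round(within_six_sigma_range, 4),
      round(range_in_units_of_q, 2))
```

On this assumption the ratio is a little over two-thirds, and the range of six times the standard deviation corresponds to slightly under nine times the semi-interquartile range.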

24.  Of  the  three  measures  of  dispersion,  the  semi-interquartile 
range  has  the  most  clear  and  simple  meaning.  It  is  calculated, 
like  the  median,  with  great  ease,  and  the  quartiles  may  be  found, 
if necessary, by measuring two individuals only. If, e.g., the
dispersion  as  well  as  the  average  stature  of  a  group  of  men 
is  required  to  be  determined  with  the  least  possible  expenditure 
of  time,  they  may  be  simply  ranked  in  order  of  height,  and  the 
three  men  picked  out  for  measurement  who  stand  in  the  centre 
and  one-quarter  from  either  end  of  the  rank.  This  measure  of 
dispersion  may  also  be  useful  as  a  makeshift  if  the  calculation 
of  the  standard  deviation  has  been  rendered  difficult  or  impossible 
owing  to  the  employment  of  an  irregular  classification  of  the 
frequency  or  of  an  indefinite  terminal  class.  Such  uses  are, 
however,   a   little   exceptional,   and,   generally   speaking,  the 
semi-interquartile range as a measure of dispersion is not to be
recommended, unless simplicity of meaning is of primary
importance, owing to the lack of algebraical convenience which
it shares with the median. Further, it is obvious that the
quartile, like the median, may become indeterminate, and that
the use of this measure of dispersion is undesirable in cases of
discontinuous variation: the student should refer again to the
discussion of the similar disadvantage in the case of the median,
Chap. VII. § 14. It has, however, been largely used in the past,
particularly  for  anthropometric  work. 

25.  Measures  of  Relative  Dispersion. — As  was  pointed  out  in 
Chapter  VII.  §  26,  if  relative  size  is  regarded  as  influencing  not  only 
the  average,  but  also  deviations  from  the  average,  the  geometric 
mean  seems  the  natural  form  of  average  to  use,  and  deviations 
should  be  measured  by  their  ratios  to  the  geometric  mean.  As 
already  stated,  however,  this  method  of  measuring  deviations,  with 
its  accompanying  employment  of  the  geometric  mean,  has  never 
come  into  general  use.  It  is  a  much  more  simple  matter  to  allow 
for  the  influence  of  size  by  taking  the  ratio  of  the  measure  of 
absolute dispersion (e.g. standard deviation, mean deviation, or
quartile deviation) to the average (mean or median) from which
the deviations were measured. Pearson has termed the quantity

$v = 100\,\frac{\sigma}{M},$

i.e. the percentage ratio of the standard deviation to the arithmetic
mean, the coefficient of variation (ref. 7), and has used it, for
example, in comparing the relative variations of corresponding
organs or characters in the two sexes: the ratio of the quartile
deviation to the median has also been suggested (Verschaeffelt,
ref.  8).  Such  a  measure  of  relative  dispersion  is  evidently  a  mere 
number,  and  its  magnitude  is  independent  of  the  units  of 
measurement  employed. 
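
That the coefficient of variation is a mere number, independent of the unit, is easily verified. In the sketch below the statures are hypothetical figures chosen for illustration, and `coefficient_of_variation` is a name assumed here, not one used in the text.

```python
from statistics import fmean, pstdev


def coefficient_of_variation(values):
    """Percentage ratio of the standard deviation to the arithmetic mean."""
    return 100 * pstdev(values) / fmean(values)


# Hypothetical statures, first in inches and then converted to centimetres.
statures_in = [64.0, 66.5, 67.0, 68.5, 70.0, 71.5]
statures_cm = [2.54 * s for s in statures_in]

cv_in = coefficient_of_variation(statures_in)
cv_cm = coefficient_of_variation(statures_cm)
print(cv_in, cv_cm)
```

The standard deviation and the mean are both multiplied by 2.54 on the change of unit, so that their ratio is unaltered.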

26.  Measures  of  Asymmetry  or  Skewness. — If  we  have  to  compare 
a series of distributions of varying degrees of asymmetry, or
skewness, as Pearson has termed it, some numerical measure of this
character  is  desirable.  Such  a  measure  of  skewness  should 
obviously  be  independent  of  the  units  in  which  we  measure  the 
variable — e.g.  the  skewness  of  the  distribution  of  the  weights  of  a 
given  set  of  men  should  not  be  dependent  on  our  choice  of  the 
pound,  the  stone,  or  the  kilogramme  as  the  unit  of  weight — and 
the  measure  should  accordingly  be  a  mere  number.  Thus  the 
difference between the deviations of the two quartiles on either
side  of  the  median  indicates  the  existence  of  skewness,  but  to 
measure  the  degree  of  skewness  we  should  take  the  ratio  of  this 
difference to some quantity of the same dimensions, e.g. the
semi-interquartile range. Our measure would then be, taking the
skewness to be positive if the longer tail of the distribution runs
in the direction of high values of X,

$\text{skewness} = \frac{(Q_3 - Mi) - (Mi - Q_1)}{Q} = \frac{Q_1 + Q_3 - 2\,Mi}{Q}$ . (11)

This  would  not  be  a  bad  measure  if  we  were  using  the  quartile 
deviation  as  a  measure  of  dispersion  :  its  lowest  value  is  zero, 
when  the  distribution  is  symmetrical ;  and  while  its  highest  possible 
value  is  2,  it  would  rarely  in  practice  attain  higher  numerical 
values  than  ±1.  A  similar  measure  might  be  based  on  the  mean 
deviations  in  excess  and  in  defect  of  the  mean.  There  is,  however, 
only  one  generally  recognised  measure  of  skewness,  and  that  is 
Pearson's  measure  (ref.  9) — 

$\text{skewness} = \frac{\text{mean} - \text{mode}}{\text{standard deviation}}$ . (12)

This  is  evidently  zero  for  a  symmetrical  distribution,  in  which 
mode  and  mean  coincide.  No  upper  limit  to  the  ratio  is  apparent 
from  the  formula,  but,  as  a  fact,  the  value  does  not  exceed  unity  for 
frequency-distributions  resembling  generally  the  ideal  distributions 
of  fig.  9.  As  the  mode  is  a  difficult  form  of  average  to  determine 
by  elementary  methods,  it  may  be  noted  that  the  numerator  of  the 
above  fraction  may,  in  the  case  of  frequency-distributions  of  the 
forms  referred  to,  be  replaced  approximately  by  3(mean  -  median), 
(cf. Chap. VII. § 20). The measure (12) is much more sensitive
than (11) for moderate degrees of asymmetry.
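
The two measures (11) and (12) may be compared on a small series. The sketch below is illustrative only. The data are hypothetical, the numerator of (12) is replaced by 3(mean - median) as suggested above, and the quartiles are taken from `statistics.quantiles`, whose rule of interpolation is that of the Python standard library and not necessarily the rule of inspection given earlier in the chapter.

```python
from statistics import fmean, median, pstdev, quantiles


def quartile_skewness(values):
    """Measure (11): (Q1 + Q3 - 2 Mi) / Q, with Q the semi-interquartile range."""
    q1, mi, q3 = quantiles(values, n=4, method="inclusive")
    q = (q3 - q1) / 2
    return (q1 + q3 - 2 * mi) / q


def pearson_skewness_approx(values):
    """Measure (12) with the numerator approximated by 3(mean - median)."""
    return 3 * (fmean(values) - median(values)) / pstdev(values)


symmetrical = [1, 2, 3, 4, 5, 6, 7]           # hypothetical
long_upper_tail = [1, 2, 2, 3, 3, 3, 4, 5, 8, 15]  # hypothetical

for series in (symmetrical, long_upper_tail):
    print(quartile_skewness(series), pearson_skewness_approx(series))
```

Both measures vanish for the symmetrical series and are positive for the series whose longer tail runs towards high values.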

27.  The  Method  of  Percentiles. — We  may  conclude  this  chapter 
by  describing  briefly  a  method  that  has  been  largely  used  in  the 
past  in  lieu  of  the  methods  dealt  with  in  Chapters  VI.  and  VII., 
and  the  preceding  paragraphs  of  this  chapter,  for  summarising 
such  statistics  as  we  have  been  considering.  If  the  values  of  the 
variable  (variates,  as  they  are  sometimes  termed)  be  ranged  in 
order  of  magnitude,  and  a  value  P  of  the  variable  be  determined 
such  that  a  percentage  p  of  the  total  frequency  lies  below  it  and 
$100 - p$ above, then P is termed a percentile. If a series of
percentiles be determined for short intervals, e.g. 5 per cent. or 10
per cent., they suffice by themselves to show the general form
of the distribution. This is Sir Francis Galton's method of
percentiles. The deciles, or values of the variable which divide
the total frequency into ten equal parts, form a natural and
convenient series of percentiles to use. The fifth decile, or value
of the variable which has 50 per cent. of the observed values
above it and 50 per cent. below, is the median: the two quartiles
lie between the second and third and the seventh and eighth
deciles respectively.

28.  The  deciles,  like  the  median  and  quartiles,  may  be 
determined  either  by  arithmetical  or  by  graphical  interpolation, 
excluding  the  cases  in  which,  like  the  former  constants,  they 
become indeterminate (cf. § 24). It is hardly necessary to give
an  illustration  of  the  former  process,  as  the  method  is  precisely 
the  same  as  for  median  and  quartiles  (Chap.  VII.  §  15,  and  above, 
§ 22). Fig. 26 shows, of course on a very much reduced scale, the
curve used for obtaining the deciles by the graphical method in
the case of the distribution of pauperism (Example ii. above).
The figures of the original table are added up step by step from
the top, so as to give the total frequency not exceeding the upper
limit of each class-interval, and ordinates are then erected to a
horizontal base to represent on some scale these integrated
frequencies: a smooth curve is then drawn through the tops of
the ordinates so obtained. This curve, as will be seen from the
figure, rises slowly at first when the frequencies are small, then
more rapidly as they increase, and finally turns over again and
becomes quite flat as the frequencies tail off to zero. The deciles
may be readily obtained from such a curve by dividing the
terminal ordinate into ten equal parts, and projecting the points
so obtained horizontally across to the curve and then vertically
down to the base. The construction is indicated on the figure for
the fourth decile, the value of which is approximately 2.88 per cent.

Fig. 26. Curve showing the number of Districts of England and Wales in
which the Pauperism on 1st January 1891 did not exceed any given
percentage of the population (same data as Fig. 10, p. 92): graphical
determination of Deciles. Horizontal axis: percentage of the population
in receipt of relief.
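
The construction may also be imitated arithmetically. The sketch below adds up the frequencies step by step and interpolates along straight lines joining the tops of the ordinates, which is an assumption made for simplicity: the text draws a smooth curve, so that values read from the figure may differ slightly. The class limits and frequencies are hypothetical, and `percentile_from_grouped` is a name chosen here for illustration.

```python
from itertools import accumulate


def percentile_from_grouped(upper_limits, frequencies, lowest_limit, p):
    """Value below which p per cent. of the total frequency lies.

    Interpolates linearly on the integrated (cumulative) frequencies,
    which are taken at the upper limit of each class-interval.
    """
    cumulative = list(accumulate(frequencies))
    target = cumulative[-1] * p / 100
    lower, below = lowest_limit, 0
    for upper, cum in zip(upper_limits, cumulative):
        if target <= cum:
            return lower + (target - below) / (cum - below) * (upper - lower)
        lower, below = upper, cum
    return upper_limits[-1]


# Hypothetical grouped distribution: five unit class-intervals from 0 to 5.
upper_limits = [1, 2, 3, 4, 5]
frequencies = [10, 20, 40, 20, 10]

deciles = [
    percentile_from_grouped(upper_limits, frequencies, 0, 10 * k)
    for k in range(1, 10)
]
print(deciles)
```

The fifth entry of the list is the median, and the two quartiles lie between the second and third and between the seventh and eighth entries respectively.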

29. The curve of fig. 26 may be drawn in a different way by
taking a horizontal base divided into ten or a hundred equal
parts (grades, as Sir Francis Galton has termed them), and erecting
at each point so obtained a vertical proportional to the
corresponding percentile. This gives the curve of fig. 27, which was
obtained by merely redrafting fig. 26. The curve is of so-called
ogive form. The ogive curve for the distribution of statures
(Example iii.) is shown for comparison in fig. 28. It will be noticed
that the ogive curve does not bring out the asymmetry of the
distribution of pauperism nearly so clearly as the frequency-
polygon, fig. 10, p. 92.

Fig. 27. The curve of Fig. 26 redrawn so as to give the Pauperism
corresponding to each grade: Galton's "Ogive."

30.  The  method  of  percentiles  has  some  advantages  as  a  method 
of  representation,  as  the  meaning  of  the  various  percentiles  is  so 
simple  and  readily  understood.  An  extension  of  the  method  to 
the  treatment  of  non-measurable  characters  has  also  become  of 
some importance. For example, the capacity of the different boys
in a class as regards some school subject cannot be directly
measured, but it may not be very difficult for the master to
arrange them in order of merit as regards this character: if the
boys are then "numbered up" in order, the number of each boy,
or his rank, serves as some sort of index to his capacity (cf. the
remarks in § 12. It should be noted that rank in this sense is
not quite the same as grade; if a boy is tenth, say, from the
bottom in a class of a hundred his grade is 9.5, but the method
is in principle the same as that of grades or percentiles).
The method of ranks, grades, or percentiles in such a case may
be a very serviceable auxiliary, though, of course, it is better if
possible to obtain a numerical measure. But if, in the case of a
measurable character, the percentiles are used not merely as
constants illustrative of certain aspects of the frequency-distribution,
but entirely to replace the table giving the frequency-
distribution, serious inconvenience may be caused, as the
application of other methods to the data is barred. Given the
table showing the frequency-distribution, the reader can calculate
not only the percentiles, but any form of average or measure of
dispersion that has yet been proposed, to a sufficiently high
degree of approximation. But given only the percentiles, or at
least so few of them as the nine deciles, he cannot pass back to
the frequency distribution, and thence to other constants, with any
degree of accuracy. In all cases of published work, therefore,
the figures of the frequency-distribution should be given; they
are absolutely fundamental.

Fig. 28. Ogive Curve for Stature, same data as Fig. 6, p. 89.
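
The distinction between rank and grade noted in the parenthesis above may be put in a line of arithmetic. The sketch assumes, as the example of the tenth boy in a class of a hundred suggests, that each individual is regarded as occupying the middle of his own place in the rank; the function name `grade_from_rank` is hypothetical.

```python
def grade_from_rank(rank_from_bottom, class_size):
    """Grade (percentage of the class below the individual) for a given rank.

    The individual is counted as lying half below and half above his own
    position, whence the subtraction of one-half from the rank.
    """
    return 100 * (rank_from_bottom - 0.5) / class_size


print(grade_from_rank(10, 100))
```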
EXERCISES.


1.  Verify  the  following  from  the  data  of  Table  VI.,  Chap.  VI.,  continuing 
the  work  from  the  stage  reached  for  Qu.  1,  Chap.  VII. 


                                         Stature in Inches for Adult Males born in
                                         England.   Scotland.   Wales.    Ireland.

Standard deviation                          2.56       2.50       2.35       2.17
Mean deviation                              2.05       1.95       1.82       1.69
Quartile deviation                          1.78       1.56       1.46       1.35
Mean deviation / standard deviation         0.80       0.78       0.78       0.78
Quartile deviation / standard deviation     0.69       0.62       0.62       0.62
Lower quartile                             65.55      66.92      65.06      66.39
Upper quartile                             69.10      70.04      67.98      69.10

2.  (Continuing  from  Qu.  2,  Chap.  VII.)  Find  the  standard  deviation, 
mean  deviation,  quartiles  and  quartile  deviation  (or  semi-interquartile  range) 
for  the  distribution  of  weights  of  adult  males  in  the  United  Kingdom  given  in 
the  last  column  of  Table  IX.,  Chap.  VI. 

Compare  the  ratios  of  the  mean  and  quartile  deviations  to  the  standard 
deviation  with  the  ratios  stated  in  §§  19  and  23  to  be  usual. 

Find  the  value  of  the  skewness  (equation  12),  using  the  approximate  value 
of  the  mode. 

3.  Using,  or  extending  if  necessary,  your  diagram  for  Question  4,  Chap.  VII. , 
find  the  quartile  values  for  houses  assessed  to  inhabited  house  duty  in  1885-6, 
from  the  data  of  Table  IV.,  Chap.  VI. 

Find  also  the  9th  decile  (the  value  exceeded  by  10  per  cent,  of  the  houses 
only). 

4.  Verify  equation  (9)  by  direct  calculation  of  the  standard  deviation  of  the 
numbers  1  to  10. 

5. (Data from Sauerbeck, Jour. Roy. Stat. Soc., March 1909.) The
following are the index-numbers (percentages) of prices of 45 commodities in
1908 on their average prices in the years 1867-77: 40, 43, 43, 46, 46, 46,
54,  56,  59,  62,  64,  64,  66,  66,  67,  67,  68,  68,  69,  69,  69,  71,  75,  75,  76,  76, 
78,  80,  82,  82,  82,  82,  82,  83,  84,  86,  88,  90,  90,  91,  91,  92,  95,  102,  127. 
Find  the  mean  and  standard  deviation  (1)  without  further  grouping  ;  (2) 
grouping  the  numbers  by  fives  (40-,  45-,  50-,  etc. ) ;  (3)  grouping  by  tens  (40-, 
50-,  60-,  etc.). 

6.  (Continuing  from  Qu.  8,  Chap.  VII.)  Supposing  the  frequencies  of 
values  0,  1,  2,  3,  .  .  .  of  a  variable  to  be  given  by  the  terms  of  the  binomial 
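series

$q^n,\ n q^{n-1} p,\ \frac{n(n-1)}{1 \cdot 2}\, q^{n-2} p^2,\ \frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3}\, q^{n-3} p^3,\ \ldots$

where $p + q = 1$, find the standard deviation.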

7. (Cf. the remarks at the end of § 17.) The sum of the deviations (without
regard to sign) about the centre of the class-interval containing the mean
(or median), in a grouped frequency-distribution, is found to be S. Find the
correction to be applied to this sum, in order to reduce it to the mean (or
median) as origin, on the assumption that the observations are evenly
distributed over each class-interval. Take the number of observations below the
interval containing the mean (or median) to be $n_1$, in that interval $n_2$,
above it $n_3$; and the distance of the mean (or median) from the arbitrary
origin to be $d$.

Show  that  the  values  of  the  mean  deviation  (from  the  mean  and  from  the 
median  respectively)  for  Example  ii.,  found  by  the  use  of  this  formula,  do  not 
differ  from  the  values  found  by  the  simpler  method  of  §§  16  and  17  in  the 
second  place  of  decimals. 

8. (W. Scheibner, "Ueber Mittelwerthe," Berichte der kgl. sächsischen
Gesellschaft d. Wissenschaften, 1873, p. 564, cited by Fechner, ref. 2 of
Chap. VII.: the second form of the relation is given by G. Duncker (Die
Methode der Variationsstatistik; Leipzig, 1899) as an empirical one.) Show
that if deviations are small compared with the mean, so that $(x/M)^3$ and
higher powers of $x/M$ may be neglected, we have approximately the relation

$G = M\left(1 - \frac{\sigma^2}{2M^2}\right),$

where G is the geometric mean, M the arithmetic mean, and $\sigma$ the standard
deviation: and consequently to the same degree of approximation $M^2 - G^2 = \sigma^2$.

9. (Scheibner, loc. cit., Qu. 8.) Similarly, show that if deviations are small
compared with the mean, we have approximately

$H = M\left(1 - \frac{\sigma^2}{M^2}\right),$

H being the harmonic mean.


CHAPTER  IX. 


CORRELATION. 

1-3.  The  correlation  table  and  its  formation— 4-5.  The  correlation  surface— 
6-7.  The  general  problem — 8-9.  The  line  of  means  of  rows  and  the 
line  of  means  of  columns :  their  relative  positions  in  the  case  of 
independence  and  of  varying  degrees  of  correlation — 10-14.  The 
correlation  coefficient,  the  regressions,  and  the  standard-deviations  of 
arrays— 15-16.  Numerical  calculations— 17.  Certain  points  to  be 
remembered  in  calculating  and  using  the  coefficient. 

1. In Chapters VI.-VIII. we considered the frequency-distribution
of a single variable, and the more important constants
that  may  be  calculated  to  describe  certain  characters  of  such 
distributions.  We  have  now  to  proceed  to  the  case  of  two 
variables,  and  the  consideration  of  the  relations  between  them. 

2.  If  the  corresponding  values  of  two  variables  be  noted 
together,  the  methods  of  classification  employed  in  the  preceding 
chapters  may  be  applied  to  both,  and  a  table  of  double  entry  or 
contingency-table (Chap. V.) be formed, exhibiting the frequencies
of  pairs  of  values  lying  within  given  class-intervals.  Six  such 
tables  are  given  below  as  illustrations  for  the  following 
variables: — Table  I.,  two  measurements  on  a  shell  (Pecten). 
Table  II.,  ages  of  husbands  and  wives  in  England  and  Wales  in 
1901.  Table  III.,  statures  of  fathers  and  their  sons  (British). 
Table  IV.,  fertility  of  mothers  and  their  daughters  (British 
peerage). Table V., the rate of discount and the ratio of reserves
to  deposits  in  American  banks.  Table  VI.,  the  proportion  of 
male  to  total  births,  and  the  total  numbers  of  births,  in  the 
registration  districts  of  England  and  Wales. 

Each  row  in  such  a  table  gives  the  frequency-distribution  of 
the  first  variable  for  cases  in  which  the  second  variable  lies 
within  the  limits  stated  on  the  left  of  the  row.  Similarly,  every 
column  gives  the  frequency-distribution  of  the  second  variable 
for  cases  in  which  the  value  of  the  first  variable  lies  within  the 
limits  stated  at  the  head  of  the  column.  As  "  columns "  and 
"rows"  are  distinguished  only  by  the  accidental  circumstance 
of the one set running vertically and the other horizontally, and
the difference has no statistical significance, the word array
has been suggested as a convenient term to denote either a row
or a column. If the values of X in one array are associated
with values of Y between the limits $Y_n - \delta$ and $Y_n + \delta$, $Y_n$ may be
termed the type of the array. (Pearson, ref. 6.) The special
kind  of  contingency  tables  with  which  we  are  now  concerned 
are  called  correlation  tables,  to  distinguish  them  from  tables 
based  on  unmeasured  qualities  and  so  forth. 

3.  Nothing  need  be  added  to  what  was  said  in  Chapter  VI.  as 
regards  the  choice  of  magnitude  and  position  of  class-intervals. 
When  these  have  been  fixed,  the  table  is  readily  compiled  by 
taking  a  large  sheet  ruled  with  rows  and  columns  properly 
headed  in  the  same  way  as  the  final  table  and  entering  a  dot, 
stroke,  or  small  cross  in  the  corresponding  compartment  for  each 
pair  of  recorded  observations.  If  facility  of  checking  be  of 
great  importance,  each  pair  of  recorded  values  may  be  entered 
on  a  separate  card  and  these  dealt  into  little  packs  on  a  board 
ruled  in  squares,  or  into  a  divided  tray ;  each  pack  can  then  be 
run  through  to  see  that  no  card  has  been  mis-sorted.  The 
difficulty  as  to  the  intermediate  observations — values  of  the 
variables  corresponding  to  divisions  between  class-intervals — will 
be  met  in  the  same  way  as  before  if  the  value  of  one  variable 
alone  be  intermediate,  the  unit  of  frequency  being  divided 
between  two  adjacent  compartments.  If  both  values  of  the  pair 
be  intermediates,  the  observation  must  be  divided  between  four 
adjacent  compartments,  and  thus  quarters  as  well  as  halves  may 
occur in the table, as, e.g., in Table III. In this case the statures
of fathers and sons were measured to the nearest quarter-inch
and subsequently grouped by 1-inch intervals: a pair in
which the recorded stature of the father is 60.5 in. and that of
the son 62.5 in. is accordingly entered as 0.25 to each of the
four compartments under the columns 59.5-60.5, 60.5-61.5, and
the rows 61.5-62.5, 62.5-63.5. Workers will generally form
their own methods for entering such fractional frequencies
during the process of compiling, but one convenient method is
to use a small × to denote a unit and a dot for a quarter; the
four dots should be placed in the position of the four points
of the × and joined when complete. It is best to choose the
limits of class-intervals, where possible, in such a way as to avoid
fractional  frequencies. 
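
The division of intermediate observations between adjacent compartments may be sketched as follows. The code is an illustration under stated assumptions and not a method given in the text: the names `cells_for` and `tabulate` are hypothetical, class-intervals are numbered from 0 upwards from an assumed first limit of 58.5 in., and the test for a value falling on a division relies on measurements, such as quarter-inches, that are exactly representable.

```python
import math
from collections import defaultdict


def cells_for(value, first_limit, width):
    """Class-interval indices receiving `value`, with the share given to each."""
    position = (value - first_limit) / width
    index = math.floor(position)
    if position == index:  # value lies on the division between two intervals
        return [(index - 1, 0.5), (index, 0.5)]
    return [(index, 1.0)]


def tabulate(pairs, first_limit_x, first_limit_y, width=1.0):
    """Correlation table as a mapping {(row, column): frequency}."""
    table = defaultdict(float)
    for x, y in pairs:
        for column, share_x in cells_for(x, first_limit_x, width):
            for row, share_y in cells_for(y, first_limit_y, width):
                table[(row, column)] += share_x * share_y
    return dict(table)


# The pair of the text: father 60.5 in. (columns), son 62.5 in. (rows).
# With intervals 58.5-59.5, 59.5-60.5, ... numbered 0, 1, 2, ..., the
# columns concerned are 1 and 2, and the rows 3 and 4.
table = tabulate([(60.5, 62.5)], first_limit_x=58.5, first_limit_y=58.5)
print(table)
```

When one value only of the pair is intermediate, the same function yields two halves in place of four quarters.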

4.  The  distribution  of  frequency  for  two  variables  may  be 
represented  by  a  surface  or  solid  in  the  same  way  as  the  frequency- 
distribution  of  a  single  variable  may  be  represented  by  a  plane 
figure.    We  may  imagine  the  surface  to  be  obtained  by  erecting 
at the centre of every compartment of the correlation-table a
vertical of length proportionate to the frequency in that
compartment, and joining up the tops of the verticals. If the
compartments were made smaller and smaller while the class-
frequencies remained finite, the irregular figure so obtained would
approximate more and more closely towards a continuous curved
surface, a frequency-surface corresponding to the frequency-
curves for single variables of Chapter VI. The volume of the
frequency-solid  over  any  area  drawn  on  its  base  gives  the 
frequency  of  pairs  of  values  falling  within  that  area,  just  as  the 
area  of  the  frequency-curve  over  any  interval  of  the  base-line  gives 
the  frequency  of  observations  within  that  interval.  Models  of 
actual  distributions  may  be  constructed  by  drawing  the  frequency- 
distributions  for  all  arrays  of  the  one  variable,  to  the  same  scale, 
on  sheets  of  cardboard,  and  erecting  the  cards  vertically  on  a 
base-board at equal distances apart, or by marking out a
base-board in squares corresponding to the compartments of the
correlation-table, and erecting on each square a rod of wood of
height proportionate to the frequency. Such solid representations
of frequency-distributions for two variables are sometimes termed
stereograms. 

5.  It  is  impossible,  however,  to  group  the  majority  of 
frequency-surfaces,  in  the  same  way  as  the  frequency-curves, 
under  a  few  simple  types  :  the  forms  are  too  varied.  The  simplest 
ideal type is one in which every section of the surface is a
symmetrical curve, the first type of Chap. VI. (fig. 5, p. 89). Like
the  symmetrical  distribution  for  the  single  variable,  this  is  a  very 
rare  form  of  distribution  in  economic  statistics,  but  approximate 
illustrations  may  be  drawn  from  anthropometry.  Fig.  29  shows 
the  ideal  form  of  the  surface,  somewhat  truncated,  and  fig. 
30  the  distribution  of  Table  III.,  which  approximates  to  the  same 
type, — the  difference  in  steepness  is,  of  course,  merely  a  matter  of 
scale.  The  maximum  frequency  occurs  in  the  centre  of  the 
whole  distribution,  and  the  surface  is  symmetrical  round  the 
vertical  through  the  maximum,  equal  frequencies  occurring  at 
equal  distances  from  the  mode  on  opposite  sides.  The  next 
simplest  type  of  surface  corresponds  to  the  second  type  of 
frequency-curve — the  moderately  asymmetrical.  Most,  if  not  all, 
of the distributions of arrays are asymmetrical, and like the
distribution of fig. 9, p. 92: the surface is consequently asymmetrical,
and  the  maximum  does  not  lie  in  the  centre  of  the  distribution. 
This  form  is  fairly  common,  and  illustrations  might  be  drawn 
from  a  variety  of  sources — economics,  meteorology,  anthropometry, 
etc.  The  data  of  Table  II.  will  serve  as  an  example.  The  total 
distributions and the distributions of the majority of the arrays
are asymmetrical, the skewness being positive for the rows at
the top of the table (the mode being lower than the mean), and
negative for the rows at the foot, the more central rows being
nearly symmetrical. The maximum frequency lies towards the
upper end of the table in the compartment under the row and
column headed "30-". The frequency falls off very rapidly
towards the lower ages, and slowly in the direction of old age.
Outside these two forms, it seems impossible to delimit empirically
any simple types. Tables V. and VI. are given simply as
illustrations of two very divergent forms. Fig. 31 gives a graphical
representation of the former by the method corresponding to the
histogram of Chapter VI., the frequency in each compartment
being represented by a square pillar. The distribution of
frequency is very characteristic, and quite different from that
of any of the Tables I., II., III., or IV.

Fig. 31. Frequency Surface for the Rate of Discount and Ratio of Reserves
to Deposits in American Banks (data of Table V.).

6. It is clear that such tables may be treated by any of the
methods  discussed  in  Chapter  V.,  which  are  applicable  to  all 
contingency-tables,  however  formed.  The  distribution  may  be 
investigated  in  detail  by  such  methods  as  those  of  §  4,  or  tested 
for  isotropy  (§  11),  or  the  coefficient  of  contingency  can  be 
calculated  (§§  5-8).  In  applying  any  of  these  methods,  however, 
it  is  desirable  to  use  a  coarser  classification  than  is  suited  to  the 
methods  to  be  presently  discussed,  and  it  is  not  necessary  to 
retain  the  constancy  of  the  class-interval.  The  classification 
should,  on  the  contrary,  be  arranged  simply  with  a  view  to  avoiding 
many  scattered  units  or  very  small  frequencies.  A  few  examples 
should  be  worked  as  exercises  by  the  student  (Question  3). 

7.  But  the  coefficient  of  contingency  merely  tells  us  whether, 
and  if  so,  how  closely,  the  two  variables  are  related,  and  much 
more  information  than  this  can  be  obtained  from  the  correlation- 
table,  seeing  that  the  measures  of  Chapters  VII.  and  VIII.  can  be 
applied  to  the  arrays  as  well  as  to  the  total  distributions.  If  the 
two  variables  are  independent,  the  distributions  of  all  parallel 
arrays  are  similar  (Chap.  V.  §  13);  hence  their  averages  and 
dispersions,  e.g.  means  and  standard  deviations,  must  be  the  same. 
In  general  they  are  not  the  same,  and  the  relation  between  the 
mean  or  standard  deviation  of  the  array  and  its  type  requires 
investigation.  Of  the  two  constants,  the  mean  is,  in  general,  the 
more important, and our attention will for the present be
confined to it. The majority of the questions of practical statistics
relate  solely  to  averages  :  the  most  important  and  fundamental 
question  is  whether,  on  an  average,  high  values  of  the  one  variable 
show  any  tendency  to  be  associated  with  high  (or  with  low) 
values  of  the  other.  If  possible,  we  also  desire  to  know  how  great  a 
divergence  of  the  one  variable  from  its  average  value  is  associated 
with a unit divergence of the other, and to obtain some idea as to
the  closeness  with  which  this  relation  is  usually  fulfilled. 

8. Suppose a diagram (fig. 32) to be drawn representing the
values of means of arrays. Let OX, OY be the scales of the two
variables, i.e. the scales at the head and side of the table, 01, 12,
etc., being successive class-intervals. Let $M_1$ be the mean value
of X, and $M_2$ the mean value of Y. If the two variables be
absolutely independent, the distributions of frequency in all
parallel arrays are similar (Chap. V. § 13), and the means of arrays
must lie on the vertical and horizontal lines $M_1M$, $M_2M$, the
small circles denoting means of rows and the small crosses means
of columns. (In any actual case, of course, the means would not
lie so regularly, but, if the independence were almost complete,
would only fluctuate slightly to the one side and the other of the
two lines.)

Fig. 32.

The  cases  with  which  the  experimentalist,  e.g.  the  chemist  or 
physicist,  has  to  deal,  where  the  observations  are  all  crowded 
closely  round  a  single  line,  lie  at  the  opposite  extreme  from 
independence.  The  entries  fall  into  a  few  compartments  only  of 
each  array,  and  the  means  of  rows  and  of  columns  lie  approximately 
on  one  and  the  same  curve,  like  the  line  RR  of  fig.  33. 

The ordinary cases of statistics are intermediate between these
two extremes, the lines of means being neither at right angles as
in fig. 32, nor coincident as in fig. 33, but standing at an acute
angle with one another as RR (means of rows) and CC (means of
columns) in figs. 36-38. The complete problem of the statistician,
like that of the physicist, is to find formulae or equations which
will suffice to describe approximately these curves.

9.  In  the  general  case  this  may  be  a  difficult  problem,  but,  in 
the  first  place,  it  often  suffices,  as  already  pointed  out,  to  know 
merely  whether  on  an  average  high  values  of  the  one  variable 
show  any  tendency  to  be  associated  with  high  or  with  low  values 
of  the  other,  a  purpose  which  will  be  served  very  fairly  by  fitting  a 
straight line; and further, in a large number of cases, it is found
either (1) that the means of arrays lie very approximately round
straight  lines,  or  (2)  that  they  lie  so  irregularly  (possibly  owing 
only  to  paucity  of  observations)  that  the  real  nature  of  the  curve 
is  not  clearly  indicated,  and  a  straight  line  will  do  almost  as  well 
as any more elaborate curve. (Cf. figs. 36-38.) In such cases
— and  they  are  relatively  more  frequent  than  might  be  supposed 
— the  fitting  of  straight  lines  to  the  means  of  arrays  determines 
all  the  most  important  characters  of  the  distribution.  We  might 
fit  such  lines  by  a  simple  graphical  method,  plotting  the  points 
representing  means  of  arrays  on  a  diagram  like  those  of  figures 
36-38,  and  "  fitting  "  lines  to  them,  say,  by  means  of  a  stretched 
black  thread  shifted  about  till  it  appeared  to  run  as  near  as 
might be to all the points. But such a method is hardly satisfactory,
more especially if the points are somewhat scattered; it
leaves too much room for guesswork, and different observers obtain
very different results. Some method is clearly required which
will  enable  the  observer  to  determine  equations  to  the  two  lines 
for  a  given  distribution,  however  irregularly  the  means  may  lie, 
as  simply  and  definitely  as  he  can  calculate  the  means  and 
standard deviations.

10. Consider the simplest case in which the means of rows lie
exactly on a straight line RR (fig. 34). Let $M_2$ be the mean
value of Y, and let RR cut $M_2 x$, the horizontal through $M_2$, in M.
Then it may be shown that the vertical through M must cut OX
in $M_1$, the mean of X. For, let the slope of RR to the vertical,
i.e. the tangent of the angle $M_1 M R$ or ratio of ld to lM, be $b_1$,
and let deviations from My, Mx be denoted by x and y. Then for
any one row of type y in which the number of observations is n,
$\Sigma(x) = n\,b_1 y$, and therefore for the whole table, since $\Sigma(ny) = 0$,
$\Sigma(x) = b_1\Sigma(ny) = 0$. $M_1$ must therefore be the mean of X, and
M may accordingly be termed the mean of the whole distribution.
Knowing that RR passes through M, it remains only to determine
$b_1$. This may conveniently be done in terms of the mean product
p of all pairs of associated deviations x and y, i.e.

$$p = \frac{1}{N}\Sigma(xy) \tag{1}$$

For any one row we have

$$\Sigma(xy) = y\,\Sigma(x) = n\,b_1 y^2 .$$

Therefore for the whole table

$$\Sigma(xy) = b_1\Sigma(ny^2) = N\,b_1\sigma_y^2 ,$$

or

$$b_1 = \frac{p}{\sigma_y^2} \tag{2}$$

Similarly, if CC be the line on which lie the means of columns
and $b_2$ its slope to the horizontal,

$$b_2 = \frac{p}{\sigma_x^2} \tag{3}$$

These two equations (2) and (3) are usually written in a
slightly different form. Let

$$r = \frac{p}{\sigma_x\sigma_y} \tag{4}$$

Then

$$b_1 = r\frac{\sigma_x}{\sigma_y}, \qquad b_2 = r\frac{\sigma_y}{\sigma_x} \tag{5}$$

Or we may write the equations to RR and CC:

$$x = r\frac{\sigma_x}{\sigma_y}\,y, \qquad y = r\frac{\sigma_y}{\sigma_x}\,x \tag{6}$$

These equations may, of course, be expressed, if desired, in
terms of the absolute values of the variables X and Y instead of
the deviations x and y.
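
The arithmetic of equations (1) to (5) may be sketched as follows. This is only a minimal illustration in Python: the function name `correlation_constants` and the five pairs of observations are hypothetical and are not taken from any of the tables.

```python
import math


def correlation_constants(xs, ys):
    """Return (r, b1, b2) from the mean product p, following equations (1)-(5)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations x and y from the respective means.
    dx = [v - mean_x for v in xs]
    dy = [v - mean_y for v in ys]
    # Mean product p, equation (1).
    p = sum(a * b for a, b in zip(dx, dy)) / n
    sigma_x = math.sqrt(sum(a * a for a in dx) / n)
    sigma_y = math.sqrt(sum(b * b for b in dy) / n)
    b1 = p / sigma_y ** 2            # equation (2): slope of RR to the vertical
    b2 = p / sigma_x ** 2            # equation (3): slope of CC to the horizontal
    r = p / (sigma_x * sigma_y)      # equation (4)
    return r, b1, b2


# Hypothetical observations, chosen only so that the arithmetic is easy to follow.
r, b1, b2 = correlation_constants([1.0, 2.0, 3.0, 4.0, 5.0],
                                  [2.0, 1.0, 4.0, 3.0, 5.0])
print(r, b1, b2)
```

Since $b_1 = p/\sigma_y^2$ and $b_2 = p/\sigma_x^2$, the product $b_1 b_2$ is always equal to $r^2$, which gives a convenient check on the work.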

11. The meaning of the above expressions when the means of
rows and columns do not lie exactly on straight lines is very
readily obtained. If the values of $x - b_1 y$ be noted for all
pairs of associated deviations, we have for the sum of the
squares of the differences, giving $b_1$ its value from (5),

$$\Sigma(x - b_1 y)^2 = N\sigma_x^2(1 - r^2) \tag{7}$$

If $b_1$ be given any other value, say $(r + \delta)\sigma_x/\sigma_y$, then

$$\Sigma(x - b_1 y)^2 = N\sigma_x^2(1 - r^2 + \delta^2).$$

This is necessarily greater than the value (7); hence $\Sigma(x - b_1 y)^2$
has the lowest possible value when $b_1$ is put equal to $r\sigma_x/\sigma_y$.
Further, for any one row in which the number of observations
is n, if the deviation of the mean of the row from RR is d (fig. 35)
and the standard deviation is $s_x$, $\Sigma(x - b_1 y)^2 = n s_x^2 + n d^2$.
Therefore for the whole table,

$$\Sigma(x - b_1 y)^2 = \Sigma(n s_x^2) + \Sigma(n d^2).$$

But the first of the two sums on the right is unaffected by the
slope or position of RR, hence, the left-hand side being a
minimum, the second sum on the right must be a minimum also.
That is to say, when $b_1$ is put equal to $r\sigma_x/\sigma_y$, the sum of the squares
of the distances of the row-means from RR, each multiplied by the
corresponding frequency, is the lowest possible.

Similar theorems hold good, of course, with respect to the line
CC. If $b_2$ be given the value $r\sigma_y/\sigma_x$, $\Sigma(y - b_2 x)^2$ is a minimum,
and also $\Sigma(n e^2)$ (fig. 35). Hence we may regard the equations (6)
as being, either (a) equations for estimating each individual x
from its associated y (and y from its associated x) in such a way
as to make the sum of the squares of the errors of estimate the
least possible; or (b) equations for estimating the mean of the x's
associated with a given type of y (and the mean of the y's associated
with a given type of x) in such a way as to make the sum of the
squares of the errors of estimate the least possible, when every
mean is counted once for each observation on which it is based.
The lines represented by the two equations are thus, in a certain
natural sense, "lines of best fit" to the two actual lines of means.

Fig. 36. Correlation between Age of Husband and Age of Wife in England
and Wales (Table II.): means of rows shown by circles and means of
columns by crosses: r = +0.91.
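
The minimum property of equation (7) can be checked numerically. The following is a sketch only, in Python; the deviations `dx`, `dy`, the helper `sum_sq`, and the displacement `delta` are hypothetical values chosen for illustration.

```python
def sum_sq(dx, dy, slope):
    """Sum of the squares of the errors made in estimating x as slope * y."""
    return sum((a - slope * b) ** 2 for a, b in zip(dx, dy))


# Hypothetical deviations from the means (each list sums to zero).
dx = [-2.0, -1.0, 0.0, 1.0, 2.0]
dy = [-1.0, -2.0, 1.0, 0.0, 2.0]
n = len(dx)

var_x = sum(a * a for a in dx) / n            # sigma_x squared
var_y = sum(b * b for b in dy) / n            # sigma_y squared
p = sum(a * b for a, b in zip(dx, dy)) / n    # mean product
r = p / (var_x * var_y) ** 0.5

ratio = (var_x / var_y) ** 0.5                # sigma_x / sigma_y
at_best = sum_sq(dx, dy, r * ratio)           # b1 = r sigma_x / sigma_y, as in (5)

delta = 0.1                                   # any other slope, (r + delta) sigma_x / sigma_y
shifted = sum_sq(dx, dy, (r + delta) * ratio)

print(at_best, n * var_x * (1 - r * r))                 # the two sides of (7)
print(shifted, n * var_x * (1 - r * r + delta ** 2))    # value for the displaced slope
```

Whatever value is given to `delta`, positive or negative, the second sum exceeds the first by $N\sigma_x^2\delta^2$, so that no other slope can give a smaller sum of squares.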

12. The constant r is of very great importance. It is evidently
a pure number, and its magnitude is unaffected by the
scales in which x and y are measured, for these scales will
affect the numerator and denominator of (4) to the same
extent. If the two variables are independent, r is zero, for
$b_1$ and $b_2$ are zero (cf. § 8). The sign is the sign of the mean
product p, and accordingly r is positive if large values of x
are associated with large values of y and conversely (as in
Tables I.-IV.), negative if small values of x are associated with
large values of y and conversely (as in Table V.). The numerical
value cannot exceed ±1, for the sum of the series of squares
in equation (7) is then zero and the sum of a series of squares
cannot be negative. If r = ±1, it follows that all the observed
pairs of deviations are subject to the relation $x/y = \pm\sigma_x/\sigma_y$: this
would be the case if the circles and crosses in such a diagram as
fig. 33 all lay on one and the same straight line. From these
properties r is termed the coefficient of correlation, and the
expression (4), $r = p/(\sigma_x\sigma_y) = \Sigma(xy)/(N\sigma_x\sigma_y)$, should be remembered.

Fig. 37. Correlation between Stature of Father and Stature of Son (Table
III.): means of rows shown by circles and means of columns by crosses;
r = +0.51.

It  should  be  noted  that,  while  r  is  zero  if  the  variables  are 
independent,  the  converse  is  not  necessarily  true :  the  fact  that 
r  is  zero  only  implies  that  the  means  of  rows  and  columns 
lie  scattered  round  two  straight   lines    which   do  not  exhibit 
any definite trend, to right or to left, upward or downward.
Two variables for which r is zero are, however, conveniently
spoken of as uncorrelated. Table VI. and fig. 39 will serve as an
illustration of a case in which the variables are almost uncorrelated
but by no means independent, r being very small (-0.014),
but the coefficient of contingency C (for grouping of qu. 3) 0.47.
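
That a zero value of r does not imply independence may be seen from a small artificial case, sketched here in Python. The values are hypothetical and are not those of Table VI.: y is taken as the square of x, so that it is completely determined by x, and yet the mean product vanishes.

```python
import math

# Hypothetical values: y depends entirely on x, but not linearly.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [v * v for v in xs]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
dx = [v - mean_x for v in xs]
dy = [v - mean_y for v in ys]

p = sum(a * b for a, b in zip(dx, dy)) / n        # mean product
sigma_x = math.sqrt(sum(a * a for a in dx) / n)
sigma_y = math.sqrt(sum(b * b for b in dy) / n)
r = p / (sigma_x * sigma_y)

print(r)  # zero: the variables are uncorrelated, though far from independent
```

Here the means of the y-arrays rise on either side of the centre, and the positive and negative products cancel, so that a straight line fitted to them shows no trend.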

Figs. 36, 37, 38 are drawn from the data of Tables II., III., and
IV., for which r has the values +0.91, +0.51, and +0.21 respectively,
the correlation being positive in each case. The student
should study such tables and diagrams closely, and endeavour to
accustom himself to estimating the value of r from the general
appearance of the table.

Fig. 38. Correlation between number of a Mother's Children and number of
her Daughter's Children (Table IV.): means of rows shown by circles
and means of columns by crosses: r = +0.21.

13. The two quantities

$$b_1 = r\frac{\sigma_x}{\sigma_y}, \qquad b_2 = r\frac{\sigma_y}{\sigma_x}$$

are termed the coefficients of regression, or simply the regressions,
$b_1$ being the regression of x on y, or deviation in x corresponding
on the average to a unit change in the type of y, and $b_2$ being
similarly the regression of y on x. Whilst the coefficient of
correlation  is  always  a  pure  number,  the  regressions  are  only 
pure  numbers  if  the  two  variables  have  the  same  dimensions,  as 
in Tables I.-IV.: their magnitudes depend on the ratio $\sigma_x/\sigma_y$, and
consequently on the units in which x and y are measured. They
are both necessarily of the same sign (the sign of r). Since r is
not greater than unity, one at least of the regressions must be
not greater than unity, but the other may be considerably greater
if the ratio $\sigma_x/\sigma_y$ or $\sigma_y/\sigma_x$ be great. The name regression arose
from the term being first introduced in the case of inheritance of
stature (Galton, refs. 2, 3). In this case the two standard deviations
are very nearly equal, so that both $b_1$ and $b_2$ are less than
unity, say (using the more recent data of Table III.) 0.50 and 0.52.
Hence the sons of fathers of deviation x from the mean of all fathers
have an average deviation of only 0.52x from the mean of all sons;
i.e. they step back or "regress" towards the general mean, and 0.52
may be termed the "ratio of regression." In general, however,
the idea of a "stepping back" or "regression" towards a more
or less stationary mean is quite inapplicable, obviously so where
the variables are different in kind, as in Tables V. and VI.,
and the term "coefficient of regression" should be regarded simply
as a convenient name for the coefficients $b_1$ and $b_2$. RR and CC
are generally termed the "lines of regression," and equations (6)
the "regression equations." The expressions "characteristic lines,"
"characteristic equations" (Yule, ref. 8) would perhaps be better.
Where the actual means of arrays appear to be given, to a satisfactory
degree of approximation, by straight lines, we may say
that the regression is linear. It is not safe, however, to assume
that such linearity extends beyond the limits of observation.

Fig. 39. Correlation between births in a Registration District and Proportion
of Male Births per thousand of all births (England and Wales,
1881-90, Table VI.): means of rows shown by circles and means
of columns by crosses: r = -0.014.

14. The two standard deviations

$$s_x = \sigma_x\sqrt{1 - r^2}, \qquad s_y = \sigma_y\sqrt{1 - r^2}$$

are of considerable importance. It follows from (7) that $s_x$ is the
standard deviation of $(x - b_1 y)$, and similarly $s_y$ is the standard
deviation of $(y - b_2 x)$. Hence we may regard $s_x$ and $s_y$ as the
standard errors (root mean square errors) made in estimating x
from y and y from x by the respective characteristic relations

$$x = b_1 y, \qquad y = b_2 x .$$

$s_x$ may also be regarded as a kind of average standard deviation of
a row about RR, and $s_y$ as an average standard deviation of a
column about CC. In an ideal case, where the regression is
truly linear and the standard deviations of all parallel arrays are
equal, a case to which the distribution of Table III. is a rough
approximation, $s_x$ is the standard deviation of the x-array and $s_y$
the standard deviation of the y-array (cf. Chap. X. § 19 (3)).
Hence $s_x$ and $s_y$ are sometimes termed the "standard deviations
of arrays."

15. Proceeding now to the arithmetical work, the only new
expression that has to be calculated in order to determine r, $b_1$, $b_2$,
$s_x$, and $s_y$ is the product sum $\Sigma(xy)$ or the mean product p. As in
the  cases  of  means  and  standard  deviations,  the  form  of  the 
arithmetic  is  slightly  different  according  as  the  observations  are 
few  and  ungrouped,  or  sufficient  to  justify  the  formation  of  a 
correlation-table.  In  the  first  case,  as  in  Example  i.  below,  the 
work  is  quite  straightforward. 

Table VII. Theory of Correlation: Example i.

Columns: (1) Union. (2) X, estimated average earnings of agricultural
labourers, shillings and pence per week. (3) Y, percentage of population in
receipt of Poor-law relief. (4) x, deviation of X from mean (pence).
(5) y, deviation of Y from mean. (6) x². (7) y². (8) Products xy, positive.
(9) Products xy, negative.

    1.                 2.      3.     4.    5.      6.     7.       8.      9.
    Union              X       Y      x     y       x²     y²       xy(+)   xy(-)
                       s. d.

 1. Glendale           20  9   2.40   +58   -1.27   3364   1.6129            73.66
 2. Wigton             20  3   2.29   +52   -1.38   2704   1.9044            71.76
 3. Garstang           19  8   1.39   +45   -2.28   2025   5.1984           102.60
 4. Belper             18  6   1.92   +31   -1.75    961   3.0625            54.25
 5. Nantwich           17  8   2.98   +21   -0.69    441   0.4761            14.49
 6. Atcham             17  6   1.17   +19   -2.50    361   6.2500            47.50
 7. Driffield          17  1   3.79   +14   +0.12    196   0.0144    1.68
 8. Uttoxeter          17  0   3.01   +13   -0.66    169   0.4356             8.58
 9. Wetherby           17  0   2.39   +13   -1.28    169   1.6384            16.64
10. Easingwold         16 11   2.78   +12   -0.89    144   0.7921            10.68
11. Southwell          16  6   3.09    +7   -0.58     49   0.3364             4.06
12. Hollingbourn       16  4   2.78    +5   -0.89     25   0.7921             4.45
13. Melton Mowbray     16  3   2.61    +4   -1.06     16   1.1236             4.24
14. Truro              16  3   4.33    +4   +0.66     16   0.4356    2.64
15. Godstone           16  0   3.02    +1   -0.65      1   0.4225             0.65
16. Louth              16  0   4.20    +1   +0.53      1   0.2809    0.53
17. Brixworth          15  9   1.29    -2   -2.38      4   5.6644    4.76
18. Crediton           15  8   5.16    -3   +1.49      9   2.2201             4.47
19. Holbeach           15  6   4.75    -5   +1.08     25   1.1664             5.40
20. Maldon             15  6   4.64    -5   +0.97     25   0.9409             4.85
21. Monmouth           15  4   4.26    -7   +0.59     49   0.3481             4.13
22. St Neots           15  3   1.66    -8   -2.01     64   4.0401   16.08
23. Swaffham           15  0   5.37   -11   +1.70    121   2.8900            18.70
24. Thakeham           15  0   3.38   -11   -0.29    121   0.0841    3.19
25. Thame              15  0   5.84   -11   +2.17    121   4.7089            23.87
26. Thingoe            15  0   4.63   -11   +0.96    121   0.9216            10.56
27. Basingstoke        15  0   3.93   -11   +0.26    121   0.0676             2.86
28. Cirencester        15  0   4.54   -11   +0.87    121   0.7569             9.57
29. North Witchford    14 10   3.42   -13   -0.25    169   0.0625    3.25
30. Pewsey             14  9   5.88   -14   +2.21    196   4.8841            30.94
31. Bromyard           14  9   4.36   -14   +0.69    196   0.4761             9.66
32. Wantage            14  9   3.85   -14   +0.18    196   0.0324             2.52
33. Stratford on Avon  14  7   3.92   -16   +0.25    256   0.0625             4.00
34. Dorchester         14  6   4.48   -17   +0.81    289   0.6561            13.77
35. Woburn             14  6   5.67   -17   +2.00    289   4.0000            34.00
36. Buntingford        14  4   4.91   -19   +1.24    361   1.5376            23.56
37. Pershore           13  6   4.34   -29   +0.67    841   0.4489            19.43
38. Langport           12  6   5.19   -41   +1.52   1681   2.3104            62.32

    Mean               15 11   3.67
    Total                                          16,018  63.0556  32.13   698.17
    Standard deviation                             20.5d.  1.29%

    $\Sigma(xy)$ = 32.13 - 698.17 = -666.04

Example i., Table VII. The variables are (1) X, the estimated
average weekly earnings of agricultural labourers in 38 English
Poor-law  unions  of  an  agricultural  type  (the  data  of  Example  i., 
Chap.  VIII.  p.  137).  (2)  Y — the  percentage  of  the  population 
in  receipt  of  Poor-law  relief  on  the  1st  January  1891  in  each  of  the 
same unions (B return). The means of each of the variables are
calculated in the ordinary way, and then the deviations x and y
from  the  mean  are  written  down  (columns  4  and  5) :  care  must 
be  taken  to  give  each  deviation  the  correct  sign.  These  deviations 
are  then  squared  (columns  6  and  7)  and  the  standard  deviations 
found  as  before  (Chap.  VIII.  p.  136).  Finally,  every  x  is 
multiplied  by  the  associated  y  and  the  product  entered  in  column 
8  or  column  9  according  to  its  sign.  These  columns  are  then 
added  up  separately  and  the  algebraic  sum  of  the  totals  gives 
$\Sigma(xy) = -666.04$: therefore the mean product $p = \Sigma(xy)/N = -17.53$, and

$$r = -\frac{17.53}{20.5 \times 1.29} = -0.66 .$$

There  is  therefore  a  well-marked  relation  exhibited  by  these  data 
between  the  earnings  of  agricultural  labourers  in  a  district  and 
the  percentage  of  the  population  in  receipt  of  Poor-law  relief. 
A penny is rather a small unit in which to measure deviations in
the average earnings, so for the regressions we may alter the unit
of x to a shilling, making $\sigma_x = 1.71$, and

$$b_1 = r\frac{\sigma_x}{\sigma_y} = -0.87, \qquad b_2 = r\frac{\sigma_y}{\sigma_x} = -0.50 .$$

The regression equations are therefore, in terms of these units,

$$x = -0.87\,y, \qquad y = -0.50\,x .$$

For practical purposes it is more convenient to express the
equations in terms of the absolute values of the variables rather
than the deviations: therefore, replacing x by (X - 15.94) and y
by (Y - 3.67) and simplifying, we have

$$X = 19.13 - 0.87\,Y \tag{a}$$
$$Y = 11.64 - 0.50\,X \tag{b}$$

the units being 1s. for the earnings and 1 per cent. for the
pauperism. The standard errors made in using these equations
to estimate earnings from pauperism and pauperism from earnings
respectively are

$$\sigma_x\sqrt{1 - r^2} = 15.4\text{d.} = 1.28\text{s.}$$
$$\sigma_y\sqrt{1 - r^2} = 0.97 \text{ per cent.}$$
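
The whole of the arithmetic of Example i. may be repeated mechanically. The following Python sketch is an illustration only: it takes columns 2 and 3 as read from Table VII., and the variable names are assumptions. It works from the unrounded means, so that the last digit of some constants differs slightly from the values in the text, which are built up from rounded figures (for instance the regression of earnings on pauperism comes out near -0.88 rather than -0.87).

```python
import math

# Column 2 of Table VII: earnings as (shillings, pence) per week.
earnings = [(20, 9), (20, 3), (19, 8), (18, 6), (17, 8), (17, 6), (17, 1), (17, 0),
            (17, 0), (16, 11), (16, 6), (16, 4), (16, 3), (16, 3), (16, 0), (16, 0),
            (15, 9), (15, 8), (15, 6), (15, 6), (15, 4), (15, 3), (15, 0), (15, 0),
            (15, 0), (15, 0), (15, 0), (15, 0), (14, 10), (14, 9), (14, 9), (14, 9),
            (14, 7), (14, 6), (14, 6), (14, 4), (13, 6), (12, 6)]
# Column 3 of Table VII: percentage of population in receipt of relief.
pauperism = [2.40, 2.29, 1.39, 1.92, 2.98, 1.17, 3.79, 3.01, 2.39, 2.78, 3.09, 2.78,
             2.61, 4.33, 3.02, 4.20, 1.29, 5.16, 4.75, 4.64, 4.26, 1.66, 5.37, 3.38,
             5.84, 4.63, 3.93, 4.54, 3.42, 5.88, 4.36, 3.85, 3.92, 4.48, 5.67, 4.91,
             4.34, 5.19]

X = [12 * s + d for s, d in earnings]   # earnings in pence
Y = pauperism
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n
sum_xx = sum((a - mean_x) ** 2 for a in X)
sum_yy = sum((b - mean_y) ** 2 for b in Y)
sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(X, Y))

sigma_x = math.sqrt(sum_xx / n)         # in pence
sigma_y = math.sqrt(sum_yy / n)         # in per cent.
r = sum_xy / (n * sigma_x * sigma_y)

# Regressions with the shilling as the unit of x.
b1 = r * (sigma_x / 12) / sigma_y       # shillings per 1 per cent. of pauperism
b2 = r * sigma_y / (sigma_x / 12)       # per cent. of pauperism per shilling

# Standard errors of estimate.
s_x = sigma_x * math.sqrt(1 - r * r)    # pence
s_y = sigma_y * math.sqrt(1 - r * r)    # per cent.

print(round(mean_x / 12, 2), round(mean_y, 2))
print(round(sigma_x, 1), round(sigma_y, 2), round(r, 2))
print(round(b1, 2), round(b2, 2), round(s_x, 1), round(s_y, 2))
```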
The equation (b) tells us therefore that a rise of 2s. in earnings
in passing from one district to another means on the average a
fall of 1 in the percentage in receipt of relief. A natural conclusion
would be that this means a direct effect of the higher
earnings in diminishing the necessity for relief, but such a
conclusion cannot be accepted offhand. Equation (a) indicates,
for instance, that every rise of a unit in the percentage relieved
corresponds to a fall of 0.87 shillings, or 10½d., in earnings:
this might mean that the giving of relief tends to depress wages.
Which is the correct interpretation of the facts? The above
regression equations alone cannot tell us this, and it is in the
discussion of such questions that most of the difficulties of statistical
arguments arise.

Fig. 40. Correlation between Pauperism and Average Earnings of Agricultural
Labourers for certain districts of England (data of Table VII.): RR,
CC, lines of regression: r = -0.66.

As  a  check  on  the  whole  of  the  arithmetical  work,  and  to  test 
whether the correlation coefficient is unduly affected by a few outlying
observations, or, perhaps, by the regression not being linear,
it  is  always  as  well  to  draw  a  diagram  representing  the  results 
obtained.  Take  scales  along  two  axes  at  right  angles  (fig.  40) 
representing  the  variables,  and  insert  a  dot  (better,  for  clearness, 
a  small  circle  or  a  cross)  at  the  point  determined  by  each  observed 
pair  of  X  and  Y.    Complete  the  diagram  by  inserting  the  two  lines 
RR and CC given by the regression equations (a) and (b). In
doing this it is as well to determine a point at each end of both
lines, and then to check the work by seeing that they meet in the
mean of the whole distribution. Thus RR is determined from (a)
by the points Y = 0, X = 19.13 and Y = 6, X = 13.91: CC is
determined from (b) by the points X = 12, Y = 5.64 and X = 21,
Y = 1.14. Marking in these points, and drawing the lines, they
will be found to meet in the mean, X = 15.94, Y = 3.67. The
diagram  gives  a  very  clear  idea  of  the  distribution ;  clearly  the 
regression  is  as  nearly  linear  as  may  be  with  so  very  scattered  a 
distribution,  and  there  are  no  very  exceptional  observations.  The 
most  exceptional  districts  are  Brixworth  and  St  Neots  with  rather 
low  earnings  but  very  low  pauperism,  and  Glendale  and  Wigton 
with  the  highest  earnings  but  a  pauperism  well  above  the  lowest — 
over  2  per  cent. 
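
The check that the two lines of regression meet in the mean can also be made by arithmetic instead of by drawing. The sketch below, in Python, simply solves the two equations (a) and (b) of the text simultaneously; the variable names are hypothetical.

```python
# Regression equations of Example i. in the form X = a0 + a1*Y and Y = c0 + c1*X.
a0, a1 = 19.13, -0.87    # equation (a)
c0, c1 = 11.64, -0.50    # equation (b)

# Substituting (b) into (a) gives X = a0 + a1*(c0 + c1*X), which is solved for X.
X_meet = (a0 + a1 * c0) / (1 - a1 * c1)
Y_meet = c0 + c1 * X_meet

print(round(X_meet, 2), round(Y_meet, 2))
```

The point of intersection agrees with the mean, X = 15.94, Y = 3.67, to within the rounding of the coefficients.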

16.  When  a  classified  correlation-table  is  to  be  dealt  with,  the 
procedure is of precisely the same kind as was used in the calculation
of a standard deviation, the same artifices being used to shorten
the  work.  That  is  to  say,  (1)  the  product-sum  is  calculated  in  the 
first  instance  with  respect  to  an  arbitrary  origin,  and  is  afterwards 
reduced  to  the  value  it  would  have  with  respect  to  the  mean ;  (2) 
the  arbitrary  origin  is  taken  at  the  centre  of  a  class-interval  ;  (3) 
the  class-interval  is  treated  as  the  unit  of  measurement  throughout 
the  arithmetic. 

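Let deviations from the arbitrary origin be denoted by $\xi$, $\eta$, and
let $\bar\xi$, $\bar\eta$ be the co-ordinates of the mean. Then

$$\xi = x + \bar\xi, \qquad \eta = y + \bar\eta,$$
$$\therefore\quad \xi\eta = xy + \bar\xi\,y + \bar\eta\,x + \bar\xi\bar\eta .$$

Therefore, summing, since the second and third sums on the
right vanish, being the sums of deviations from the mean,

$$\Sigma(\xi\eta) = \Sigma(xy) + N\bar\xi\bar\eta ,$$

or bringing $\Sigma(xy)$ to the left,

$$\Sigma(xy) = \Sigma(\xi\eta) - N\bar\xi\bar\eta .$$

That is, in terms of mean-products, using p' to denote the mean-
product for the arbitrary origin,

$$p = p' - \bar\xi\bar\eta .$$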

In  any  case  where  the  origin  from  which  deviations  have  been 
measured  is  not  the  mean,  this  correction  must  be  used.  It  will 
sometimes  give  a  sensible  correction  even  for  work  in  the  form  of 
Example i., and in that case, of course, the standard deviations
will  also  require  reduction  to  the  mean. 
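
The reduction of the mean product from an arbitrary origin to the mean may be verified on a small scale. The following Python sketch uses five hypothetical pairs of values and an arbitrary origin chosen for the illustration; none of these figures come from the tables.

```python
# Hypothetical observations and an arbitrary origin that is not the mean.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 1.0, 4.0, 3.0, 5.0]
origin_x, origin_y = 2.0, 2.0
n = len(X)

# Deviations from the arbitrary origin.
xi = [a - origin_x for a in X]
eta = [b - origin_y for b in Y]

p_dash = sum(a * b for a, b in zip(xi, eta)) / n   # mean product about the origin
xi_bar = sum(xi) / n                               # co-ordinates of the mean
eta_bar = sum(eta) / n
p_corrected = p_dash - xi_bar * eta_bar            # reduced to the mean

# Direct calculation from deviations about the means, for comparison.
mean_x = sum(X) / n
mean_y = sum(Y) / n
p_direct = sum((a - mean_x) * (b - mean_y) for a, b in zip(X, Y)) / n

print(p_dash, p_corrected, p_direct)
```

The corrected value agrees with the direct one, while the uncorrected mean product about the arbitrary origin is sensibly larger.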

As the arithmetical process of calculating the correlation coefficient
from a grouped table is of great importance, we give two
illustrations,  the  first  economic,  the  second  biological. 

Example  ii.,  Table  VIII. — The  two  variables  are  (1)  X,  the 
percentage  of  males  over  65  years  of  age  in  receipt  of  Poor-law 
relief  in  235  unions  of  a  mainly  rural  character  in  England  and 
Wales; (2) Y, the ratio of the numbers of persons given relief
"outdoors" (in their own homes) to one "indoors" (in the workhouse).
The figures refer to a one-day count (1st August 1890, No. 36,
1890), and the table is one of a series that were drawn up with
the view to discussing the influence of administrative methods on
pauperism. (Economic Journal, vol. vi., 1896, p. 613.)

The arbitrary origin for X was taken at the centre of the fourth
column, or at 17.5 per cent.; for Y at the centre of the fourth
row, or 3.5. The following are the values found for the constants
of the single distributions:

$\bar\xi$ = -0.1532 intervals = -0.77 per cent., whence $M_x$ = 16.73 per cent.
$\sigma_x$ = 1.29 intervals = 6.45 per cent.
$\bar\eta$ = +0.36 intervals or units, whence $M_y$ = 3.86.
$\sigma_y$ = 2.98 units.

To calculate $\Sigma(\xi\eta)$, the value of $\xi\eta$ is first written in every
compartment of the table against the corresponding frequency,
treating the class-interval as the unit: these are the figures in
heavy type in Table VIII. In making these entries the sign of
the product may be neglected, but it must be remembered that
this sign will be positive in the upper left-hand and lower right-
hand quadrants, negative in the two others. The frequencies are
then collected as shown in columns 2 and 3 of Table VIIIa.,
being grouped according to the value and sign of $\xi\eta$. Thus for
$\xi\eta$ = 1, the total frequency in the positive quadrants is 13 + 8.5
= 21.5, in the negative 14 + 6 = 20: for $\xi\eta$ = 2, 10 + 4.5 + 1 + 4.5
= 20 in the positive quadrants, 5 + 2 + 1 + 3.5 = 11.5 in the
negative, and so on. When columns 2 and 3 are completed, they
should first of all be checked to see that no frequency has been
dropped, which may be readily done by adding together the totals
of these two columns together with the frequency in row 4 and
column 4 of Table VIII. (the row and column for which $\xi\eta$ = 0),
being careful not to count twice the frequency in the compartment
common to the two; this grand total must clearly be equal to the
total number of observations or 235 in the present case. The
algebraic sum of the frequencies in each line of columns 2 and 3 is
Table VIII. Theory of Correlation: Example ii. Old-age Pauperism and
Proportion of Out-relief. (The Frequencies are the figures printed in ordinary
type. The numbers in heavy type are the Deviation-Products ($\xi\eta$).)


relieved 
Outdoors 
to  One 
Indoors. 

Percentage  of  J^Iales  over  65  in  receipt  of  Kelief. 

Total. 

0-5. 

5-10. 

10-15. 

15-20. 

20-25. 

25-30. 

30-35. 

35-40. 

0-  1  -( 

0-5 
9 

60 
6 

90 
3 

10 
0 

1-0 
12 

17-5 

1-  2  1 

3-5 
6 

13  0 
4 

100 

2 

14-0 

0 

5-0 
2 

45-5 

2-  3  1 

1-0 

3 

4-5 
2 

130 
1 

13-5 
0 

14-0 
1 

2-0 
2 

48-0 

»-  4  1 

10 

0 

4-5 
0 

7-5 
0 

14-0 

0 

140 

0 

30 

0 

440 

4-  5  ■[ 

5-  6  1 



1-0 

2 

60 
1 

11-5 
0 

8-5 
1 

1-0 

2 

28-0 

— 

3  5 
2 

30 
0 

4-5 
2 

2-0 
4 

— 

13  0 

6-  7  1 

- 

10 

6 

2-0 
3 

1-0 
0 

10 
0 

10 
0 

2  0 
3 

4-0 
6 

10 

9 

- 

11-0 

7-  8  1 

- 

0-5 
8 

1-0 

4 

30 
4 

- 

- 

- 

 — 

5  5 

8-  9  1 

- 

0-5 
10 

1-0 

5 
_ 

10 
5 

40 

10 

- 

- 

7-5 

9-10  { 

— 

i-n 
12 

20 
0 

4-0 

6 

_ 

— 

— 

70 

10-11 

- 

- 

- 

- 

- 

- 
— 

- 

- 

11-12  / 

— 

— 

— 

— 

2-0 

8 

- 

2-0 

12-13  1 

- 

- 

10 

9 

- 

- 

- 

10 

13-14  1 

- 

10 
20 

- 

- 

- 

- 

- 

10 

14-15 

— 

— 

— 

— 

— 

— 

- 

16-16  i 

1-0 

0 

1-0 

24 

20 

16-17 

17-18  1 

1-0 
28 

10 

18-19  1 

54-0 

1-0 

15 

1-0 

ToUIs 

6-0 

330 

630 

69-0 

18-0 

10 

10 

235  0 

Percentage in receipt of Relief: Mean 16.73 per cent., $\sigma_x$ 6.45 per cent.
Out-relief Ratio: Mean 3.86, $\sigma_y$ 2.98.
Table VIIIa. Calculation of the Product Sum $\Sigma(\xi\eta)$.

   1.          2.             3.            4.          5.          6.
   Value of    Frequencies                  Total       Products
   $\xi\eta$       + Quadrants    - Quadrants               Positive    Negative

      1          21.5           20          + 1.5         1.5
      2          20             11.5        + 8.5        17
      3          12              2          +10          30
      4          18              1          +17          68
      5           1              1             0
      6          17.5            1          +16.5        99
      8           2              0.5        + 1.5        12
      9           1.5            1          + 0.5         4.5
     10           4              0.5        + 3.5        35
     12                          2          - 2                       24
     15           1                         + 1          15
     20                          1          - 1                       20
     24           1                         + 1          24
     28           1                         + 1          28

   Totals       100.5           41.5                    334           44

   $\Sigma(\xi\eta)$ = +334 - 44 = +290.
   Check: 100.5 + 41.5 + 93 (frequency for which $\xi\eta$ = 0) = 235.

then entered in column 4, treating the frequencies in column 3 as if
they were themselves negative, and finally the figures of column 4
are multiplied by the values of $\xi\eta$ and the products entered in
column 5 or 6 according to sign. The algebraic sum of the totals
of columns 5 and 6 = +290 = $\Sigma(\xi\eta)$. Whence $p' = \Sigma(\xi\eta)/N = 1.234$.
To find the value of p we have, remembering that we are working
with class-intervals as the unit,

$$\bar\xi\bar\eta = -(0.153 \times 0.36) = -0.055$$
$$p = p' - \bar\xi\bar\eta = 1.234 + 0.055 = +1.289$$
$$r = +\frac{1.289}{1.29 \times 2.98} = +0.34 .$$

The regression of pauperism on out-relief ratio is, reverting to
1 per cent. as the unit of pauperism instead of the class-interval,
+0.34 × 6.45/2.98 = 0.74, and the regression equation accordingly
$x = 0.74y$, or

$$X = 13.9 + 0.74Y,$$

the standard error made in using the equation for estimating $X$
from $Y$ being $\sigma_x\sqrt{1 - r^2} = 6.07$.
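
The arithmetic of this regression can be checked with a short sketch. It assumes only the constants quoted for Table VIII. (means 16.73 and 3.86, standard deviations 6.45 and 2.98, r = +0.34); the variable names are illustrative and not part of the original.

```python
import math

# Constants quoted in the text for Example ii. (Table VIII.)
r = 0.34          # coefficient of correlation
mean_x = 16.73    # mean percentage in receipt of relief (X)
sd_x = 6.45       # standard deviation of X, in per cent.
mean_y = 3.86     # mean out-relief ratio (Y)
sd_y = 2.98       # standard deviation of Y

# Regression of X on Y: slope b = r * sd_x / sd_y.
b = r * sd_x / sd_y

# Intercept chosen so that the line passes through the two means.
a = mean_x - b * mean_y

# Standard error made in estimating X from Y.
se = sd_x * math.sqrt(1 - r ** 2)

print(f"X = {a:.1f} + {b:.2f} Y, standard error {se:.2f}")
```

Rounded to the figures used in the text, the sketch gives the slope 0.74, the constant term 13.9 and the standard error 6.07.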

This  is  the  equation  of  greatest  practical  interest,  telling  us 
that,  as  we  pass  from  one  district  to  another,  a  rise  of  1  in  the 
ratio  of  the  numbers  relieved  in  their  own  homes  to  the  numbers 
relieved  in  the  workhouse  corresponds  on  an  average  to  a  rise  of 
0.74 in the percentage in receipt of relief. The result is such as
to  create  a  presumption  in  favour  of  the  view  that  the  giving  of 
out-relief  tends  to  increase  the  numbers  relieved,  and  this  can  be 
taken  as  a  working  hypothesis  for  further  investigation. 

The  student  should  work  out  the  second  regression  equation, 
and  check  both  by  calculating  the  means  of  the  principal  rows 
and  columns,  and  drawing  a  diagram  like  figs.  36,  37,  and  38. 

Example  iii.,  Table  IX. — (Unpublished  data ;  measurements  by 
G.  U.  Yule.)  The  two  variables  are  (1)  X,  the  length  of  a  mother- 
frond  of  duckweed  (Lemna  minor) ;  (2)  Y,  the  length  of  the 
daughter-frond.  The  mother-frond  was  measured  when  the 
daughter-frond  separated  from  it,  and  the  daughter-frond  when 
its  first  daughter-frond  separated.  Measures  were  taken  from 
camera  drawings  made  with  the  Zeiss-Abbe  camera  under  a  low 
power,  the  actual  magnification  being  24  :  1.  The  units  of  length 
in  the  tabulated  measurements  are  millimetres  on  the  drawings. 

The  arbitrary  origin  for  both  X  and  Y  was  taken  at  105  mm. 
The  following  are  the  values  found  for  the  constants  of  the  single 
distributions : — 


$\bar{\xi}$ = -1.058 intervals = -6.3 mm.    $M_1$ = 98.7 mm. on drawing = 4.11 mm. actual.
$\sigma_x$ = 2.828 intervals = 17.0 mm. on drawing = 0.707 mm. actual.
$\bar{\eta}$ = -0.203 intervals = -1.2 mm.    $M_2$ = 103.8 mm. on drawing = 4.32 mm. actual.
$\sigma_y$ = 3.084 intervals = 18.5 mm. on drawing = 0.771 mm. actual.

The values of $\xi\eta$ are entered in every compartment of the
table as before, and the frequencies then collected, according to
the magnitude and sign of $\xi\eta$, in columns 2 and 3 of Table IXa.
The entries in these two columns are next checked by adding to
the totals the frequency in the row and column for which $\xi\eta$ is
zero, and seeing that it gives the total number of observations
(266). The numbers in column 4 are given by deducting the
entries in column 3 from those in column 2. The totals so
obtained are multiplied by $\xi\eta$ (column 1) and the products entered
Table IXa.

   1.          2.             3.           4.          5.         6.
                 Frequencies.                            Products.
 $\xi\eta$   + Quadrants.   - Quadrants.   Total.        +          -
    1          ..             8.5         - 8.5         ..         8.5
    2         17             13.5         + 3.5         7          ..
    3         10.5            9           + 1.5         4.5        ..
    4         13.5            6.5         + 7          28          ..
    5          2              0.5         + 1.5         7.5        ..
    6         13.5            5           + 8.5        51          ..
    8         13              1           +12          96          ..
    9          9              4           + 5          45          ..
   10          6.5            1           + 5.5        55          ..
   12         17.5            ..          +17.5       210          ..
   14          1              ..          + 1          14          ..
   15          6              ..          + 6          90          ..
   16          7              ..          + 7         112          ..
   18          2              ..          + 2          36          ..
   20          8              ..          + 8         160          ..
   21          2              ..          + 2          42          ..
   24          6              ..          + 6         144          ..
   25          1              ..          + 1          25          ..
   28          1              ..          + 1          28          ..
   30          3              ..          + 3          90          ..
   36          1              ..          + 1          36          ..
   40          1              ..          + 1          40          ..
   42          2              ..          + 2          84          ..
   60          1              ..          + 1          60          ..
   63          1              ..          + 1          63          ..
 Totals      145.5           49                      +1528        -8.5

Frequency in the row and column for which $\xi\eta$ is zero, 71.5; check, 145.5 + 49 + 71.5 = 266.
Algebraic sum of the products, +1528 - 8.5 = +1519.5.


in column 5 or 6 according to sign. The algebraic sum of the
totals of these two columns gives $\Sigma(\xi\eta)$ = +1519.5. Dividing
by 266, $p'$ = 5.712. But $\bar{\xi}\bar{\eta}$ = +1.058 × 0.203 = +0.215;
therefore $p$ = 5.712 - 0.215 = 5.497, and

$$r = \frac{5.497}{2.828 \times 3.084} = +0.63.$$
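
The product-sum method for this example can be retraced in a few lines. The sketch below is only an illustration: the rows are the entries of Table IXa as read from the printed table, and the variable names are assumptions of the sketch, not notation from the text.

```python
# Rows of Table IXa: (xi*eta, frequency in + quadrants, frequency in - quadrants).
rows = [
    (1, 0, 8.5), (2, 17, 13.5), (3, 10.5, 9), (4, 13.5, 6.5), (5, 2, 0.5),
    (6, 13.5, 5), (8, 13, 1), (9, 9, 4), (10, 6.5, 1), (12, 17.5, 0),
    (14, 1, 0), (15, 6, 0), (16, 7, 0), (18, 2, 0), (20, 8, 0), (21, 2, 0),
    (24, 6, 0), (25, 1, 0), (28, 1, 0), (30, 3, 0), (36, 1, 0), (40, 1, 0),
    (42, 2, 0), (60, 1, 0), (63, 1, 0),
]
zero_frequency = 71.5   # frequency in the row and column where xi*eta = 0
n = 266                 # total number of observations

# Check on columns 2 and 3: with the zero frequency they must total N.
total_plus = sum(plus for _, plus, _ in rows)
total_minus = sum(minus for _, _, minus in rows)
assert total_plus + total_minus + zero_frequency == n

# Column 4 is (plus - minus); columns 5 and 6 hold its product with xi*eta.
product_sum = sum(value * (plus - minus) for value, plus, minus in rows)

p_prime = product_sum / n            # mean product about the arbitrary origin
p = p_prime - (-1.058) * (-0.203)    # corrected for the means of xi and eta
r = p / (2.828 * 3.084)              # standard deviations in class-intervals

print(product_sum, round(p_prime, 3), round(p, 3), round(r, 2))
```

Carried to more decimal places than the text, p comes out near 5.498 rather than 5.497; the difference arises only from subtracting rounded figures, and r is 0.63 either way.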
Table IX. Theory of Correlation: Illustration iii. Correlation between (1) length of mother-frond, (2) length of
daughter-frond, in Lemna minor. [Unpublished data; G. U. Yule.] (The frequencies are the figures printed in ordinary
type. The numbers in heavy type are the deviation-products ($\xi\eta$).)

(1) Length of mother-frond (mm. of camera drawing enlarged 24 : 1).

[Body of the table not legible in this copy.]

The regression of daughter-frond on mother-frond is 0.69 (a
value which will not be altered by altering the units of measurement
for both mother- and daughter-fronds, as such an alteration
will affect both standard deviations equally). Hence the regression
equation giving the average actual length (in millimetres)
of daughter-fronds for mother-fronds of actual length X is

$$Y = 1.48 + 0.69X$$

We again leave it to the student to work out the second
regression equation giving the average length of mother-fronds
for daughter-fronds of length Y, and to check the whole work
by a diagram showing the lines of regression and the means of
arrays for the central portion of the table.
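
That the regression is unaffected by the choice of units may be verified numerically. The following sketch takes r = 0.63 (the value implied by p = 5.497 and the two standard deviations) together with the constants quoted for the duckweed data; the dictionary layout and names are illustrative assumptions.

```python
r = 0.63  # correlation implied by p = 5.497 and the two standard deviations

# (standard deviation of mother-frond X, of daughter-frond Y) in three units
units = {
    "class-intervals": (2.828, 3.084),
    "mm. on drawing": (17.0, 18.5),
    "mm. actual": (0.707, 0.771),
}

# Regression of Y on X is r * sd_y / sd_x; a change of units common to
# both variables cancels in the ratio of the standard deviations.
slopes = {name: r * sd_y / sd_x for name, (sd_x, sd_y) in units.items()}
for name, slope in slopes.items():
    print(f"{name}: regression = {slope:.2f}")

# Constant term in actual millimetres, using the rounded regression 0.69
# and the actual means 4.11 mm. (mother) and 4.32 mm. (daughter).
b = 0.69
intercept = 4.32 - b * 4.11
print(f"Y = {intercept:.2f} + {b:.2f} X")
```

The three regressions agree to two decimal places, while the constant term depends on the means and therefore must be worked in the same units as the standard deviations.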

17. The student should be careful to remember the following
points in working:

(1) To give $p'$ and $\bar{\xi}\bar{\eta}$ their correct signs in finding the true
mean deviation-product $p$.

(2) To express $\sigma_x$ and $\sigma_y$ in terms of the class-interval as a
unit, in the value of $r = p/\sigma_x\sigma_y$, for these are the units in terms
of which $p$ has been calculated.

(3) To use the proper units for the standard deviations (not
class-intervals in general) in calculating the coefficients of
regression: in forming the regression equation in terms of the
absolute values of the variables, for example, as above, the work
will be wrong unless means and standard deviations are expressed
in the same units.

Further, it must always be remembered that correlation
coefficients, like all other statistical measures, are subject to
fluctuations of sampling (cf. Chap. III. §§ 7, 8). If we write
on cards a series of pairs of strictly independent values of x and
y and then work out the correlation coefficient for samples of,
say, 40 or 50 cards taken at random, we are very unlikely ever
to find r = 0 absolutely, but will find a series of positive and
negative values centring round 0. No great stress can therefore
be laid on small, or even on moderately large, values of r as
indicating a true correlation if the numbers of observations be
small. For instance, if N = 36, a value of r = ±0.5 may be
merely a chance result (though a very infrequent one); if
N = 100, r = ±0.3 may similarly be a mere fluctuation of
sampling, though again an infrequent one. If N = 900, a value
of r = ±0.1 might occur as a fluctuation of sampling of the same
degree of infrequency. The student must therefore be careful in
interpreting his coefficients. (See Chap. XVII. § 15.)
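
The card experiment described above is easily imitated by simulation. The sketch below is an illustration under stated assumptions: independent normal variables stand in for the cards, the seed and the number of trials are arbitrary, and 1/√N is used only as a rough guide to the spread of r for independent variables. The values 0.5, 0.3 and 0.1 quoted above are each three times this quantity.

```python
import math
import random
import statistics


def correlation(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sample_r(n, rng):
    """Correlation found in one sample of n strictly independent pairs."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [rng.gauss(0, 1) for _ in range(n)]
    return correlation(xs, ys)


rng = random.Random(1)   # fixed seed so that the run is repeatable
trials = 400
spread = {}
for n in (36, 100, 900):
    rs = [sample_r(n, rng) for _ in range(trials)]
    spread[n] = statistics.pstdev(rs)
    print(f"N = {n}: observed spread of r = {spread[n]:.3f}, "
          f"1/sqrt(N) = {1 / math.sqrt(n):.3f}")
```

The observed coefficients scatter about zero with a spread that shrinks roughly as 1/√N, which is why the same degree of infrequency attaches to progressively smaller values of r as N increases.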

Finally, it should be borne in mind that any coefficient, e.g. the
coefficient of correlation or the coefficient of contingency, gives
only a part of the information afforded by the original data or
the correlation table. The correlation table itself, or the original
data if no correlation table has been compiled, should always be
given, unless considerations of space or of expense absolutely
preclude the adoption of such a course.
EXERCISES.

1. Find the correlation-coefficient and the equations of regression for the
following values of X and Y.

X.  Y. 

1  2 

2  5 

3  3 

4  8 

5  7 

[As a matter of practice it is never worth calculating a correlation-coefficient
for  so  few  observations  :  the  figures  are  given  solely  as  a  short  example  on 
which  the  student  can  test  his  knowledge  of  the  work.] 

2.  The  following  figures  show,  for  the  districts  of  Example  i.,  the  ratios  of 
the  numbers  of  paupers  in  receipt  of  outdoor  relief  to  the  numbers  in  receipt 
of  relief  in  the  workhouse.  Find  the  correlations  between  the  out-relief  ratio 
and  (1)  the  estimated  earnings  of  agricultural  labourers;  (2)  the  percentage 
of  the  population  in  receipt  of  relief. 


District.  Ratio.     District.  Ratio.     District.  Ratio.
    1       6.40         14       7.50         27       2.97
    2       4.04         15       4.44         28       5.38
    3       7.90         16       8.34         29       3.24
    4       3.31         17       0.69         30       7.61
    5       7.85         18       9.89         31       5.87
    6       0.45         19       4.00         32       5.50
    7      10.00         20       6.02         33       3.58
    8       4.43         21       8.27         34       6.93
    9       4.78         22       1.58         35       6.02
   10       4.73         23      16.04         36       4.92
   11       6.66         24       1.96         37       4.64
   12       1.22         25       9.28         38      10.56
   13       4.27         26       8.72

3.  Verify  the  following  data  for  the  under-mentioned  tables  of  the  preceding 
chapter.  Calculate  the  means  of  rows  and  columns  and  draw  diagrams  showing 
the  lines  of  regression,  as  figs.  36-39,  for  one  or  two  cases  at  least. 


                                      I.          II.           III.         IV.       VI.

Mean of X                          55.3 mm.    40.6 years    67.70 ins.     5.90     509.2
Mean of Y                          53.1 mm.    42.8 years    68.66 ins.     4.33     14,500
Standard deviation of X            6.86        12.7 years    2.72 ins.      2.83     7.46
Standard deviation of Y            5.77        13.1 years    2.75 ins.      2.97     18,100
Coefficient of correlation         +0.97       +0.91         +0.51          +0.21    -0.014
Coefficient of contingency (for
  the grouping stated below)       0.90        0.81          0.51           0.31     0.47

In calculating the coefficient of contingency (coefficient of mean square
contingency) use the following groupings, so as to avoid small scattered
frequencies at the extremities of the tables and also excessive arithmetic:

I. Group together (1) two top rows, (2) three bottom rows, (3) two first
columns, (4) four last columns, leaving centre of table as it stands.

II. Regroup by ten-year intervals (15-, 25-, 35-, etc.) for both husband and
wife, making the last group "65 and over."

III. Regroup by 2-inch intervals, 58.5-60.5, etc., for father, 59.5-61.5,
etc., for son. If a 3-inch grouping be used (58.5-61.5, etc., for both father and
son), the coefficient of mean square contingency is 0.465. [Both results cited
from Pearson, ref. 1 of Chap. V.]

IV. For cols., group 1 + 2, 3 + 4, . . . , 11 + 12, 13 and upwards. Rows,
0, 1 + 2, 3 + 4, . . . , 9 + 10, 11 and upwards.

VI. For cols., group all up to 494.5 and all over 521.5, leaving central cols.
Rows singly up to 20: then 20-28, 28-44, 44-56, 56 upwards.


CHAPTER  X. 


CORRELATION:  ILLUSTRATIONS  AND  PRACTICAL 
METHODS. 

1. Necessity for careful choice of variables before proceeding to calculate r.
2-8. Illustration i.: Causation of pauperism. 9-10. Illustration
ii.: Inheritance of fertility. 11-13. Illustration iii.: The weather
and the crops. 14. Correlation between the movements of two
variables: (a) Non-periodic movements: Illustration iv.: Changes
in infantile and general mortality. 15-17. (b) Quasi-periodic movements:
Illustration v.: The marriage-rate and foreign trade.
18. Elementary methods of dealing with cases of non-linear regression.
19. Certain rough methods of approximating to the correlation
coefficient. 20-22. The correlation ratio.

1.  The  student — especially  the  student  of  economic  statistics,  to 
whom  this  chapter  is  principally  addressed — should  be  careful  to 
note  that  the  coefficient  of  correlation,  like  an  average  or  a 
measure of dispersion, only exhibits in a summary and comprehensible
form one particular aspect of the facts on which it is
based,  and  the  real  difficulties  arise  in  the  interpretation  of  the 
coefficient  when  obtained.  The  value  of  the  coefficient  may  be 
consistent  with  some  given  hypothesis,  but  it  may  be  equally 
consistent  with  others ;  and  not  only  are  care  and  judgment 
essential  for  the  discussion  of  such  possible  hypotheses,  but  also 
a  thorough  knowledge  of  the  facts  in  all  other  possible  aspects. 
Further,  care  should  be  exercised  from  the  commencement  in  the 
selection  of  the  variables  between  which  the  correlation  shall  be 
determined.  The  variables  should  be  defined  in  such  a  way  as 
to render the correlations as readily interpretable as possible,
and,  if  several  are  to  be  dealt  with,  they  should  afford  the  answers 
to  specific  and  definite  questions.  Unfortunately,  the  field  of 
choice  is  frequently  very  much  limited,  by  deficiencies  in  the 
available  data  and  so  forth,  and  consequently  practical  possibilities 
as  well  as  ideal  requirements  have  to  be  taken  into  account.  No 
general  rules  can  be  laid  down,  but  the  following  are  given  as 
illustrations  of  the  sort  of  points  that  have  to  be  considered. 
2. Illustration i. It is required to throw some light on the
variations of pauperism in the unions (unions of parishes) of
England. (Cf. Yule, ref. 2.)

One table (Table VIII.) bearing on a part of this question, viz.
the influence of the giving of out-relief on the proportion of the
aged in receipt of relief, was given in Chap. IX. (p. 183). The
question was treated by correlating the percentage of the aged
relieved in different districts with the ratio of numbers relieved
outdoors to the numbers in the workhouse. Is such a method
the best possible?

On  the  whole,  it  would  seem  better  to  correlate  changes  in 
pauperism with changes in various possible factors. If we say
that a high rate of pauperism in some district is due to lax
administration, we presumably mean that as administration
became lax, pauperism rose, or that if administration were more
strict, pauperism would decrease; if we say that the high pauperism
is due to the depressed condition of industry, we mean that
when  industry  recovers,  pauperism  will  fall.  When  we  say,  in 
fact,  that  any  one  variable  is  a  factor  of  pauperism,  we  mean 
that  changes  in  that  variable  are  accompanied  by  changes  in  the 
percentage  of  the  population  in  receipt  of  relief,  either  in  the 
same  or  the  reverse  direction.  It  will  be  better,  therefore,  to 
deal  with  changes  in  pauperism  and  possible  factors.  The  next 
question  is  what  factors  to  choose. 

3. The possible factors may be grouped under three heads:

(a) Administration. Changes in the method or strictness of
administration of the law.

(b) Environment. Changes in economic conditions (wages,
prices, employment), social conditions (residential or industrial
character of the district, density of population, nationality of
population), or moral conditions (as illustrated, e.g., by the statistics
of crime).

(c) Age Distribution. The percentage of the population between
given age-limits in receipt of relief increases very rapidly with old
age, the actual figures given by one of the only two then existing
returns of the age of paupers being: 2 per cent. under age 16,
1 per cent. over 16 but under 65, 20 per cent. over 65. (Return
36, 1890.)

It is practically impossible to deal with more than three factors,
one from each of the above groups, or four variables altogether,
including the pauperism itself. What shall we take, then,
as representative variables, and how shall we best measure
"pauperism"?

4. Pauperism. The returns give (a) cost, (b) numbers relieved.
It seems better to deal with (b) (as in the illustration of Table
VIII., Chap. IX.), as numbers are more important than cost from
the  standpoint  of  the  moral  effect  of  relief  on  the  population. 
The  returns,  however,  generally  include  both  lunatics  and  vagrants 
in  the  totals  of  persons  relieved  ;  and  as  the  administrative  methods 
of  dealing  with  these  two  classes  differ  entirely  from  the  methods 
applicable  to  ordinary  pauperism,  it  seems  better  to  alter  the 
official  total  by  excluding  them.  Returns  are  available  giving 
the  numbers  in  receipt  of  relief  on  1st  January  and  1st  July; 
there  does  not  seem  to  be  any  special  reason  for  taking  the  one 
return  rather  than  the  other,  but  the  return  for  1st  January  was 
actually  used.  The  percentage  of  the  population  in  receipt  of 
relief  on  1st  January  1871,  1881,  and  1891  (the  three  census 
years),  less  lunatics  and  vagrants,  was  therefore  tabulated  for  each 
union.    (The  investigation  was  carried  out  in  1898.) 

5.  Administration. — The  most  important  point  here,  and  one 
that  lends  itself  readily  to  statistical  treatment,  is  the  relative 
proportion  of  indoor  and  outdoor  relief  (relief  in  the  workhouse 
and relief in the applicant's home). The first question is,
again, shall we measure this proportion by cost or by numbers?
The latter seems, as before, the simpler and more important ratio
for the present purpose, though some writers have preferred the
statement in terms of expenditure (e.g. Mr Charles Booth, Aged
Poor: Condition, 1894). If we decide on the statement in terms
of numbers, we still have the choice of expressing the proportion (1)
as the ratio of numbers given out-relief to numbers in the workhouse,
or (2) as the percentage of numbers given out-relief on
the total number relieved. The former method was chosen,
partly  on  the  simple  ground  that  it  had  already  been  used  in  an 
earlier  investigation,  partly  on  the  ground  that  the  use  of  the 
ratio  separates  the  higher  proportions  of  out-relief  more  clearly 
from  each  other,  and  these  differences  seem  to  have  significance. 
Thus  a  union  with  a  ratio  of  15  outdoor  paupers  to  one  indoor 
seems  to  be  materially  different  from  one  with  a  ratio  of,  say,  10 
to  1 ;  but  if  we  take,  instead  of  the  ratios,  the  percentages  of 
outdoor  to  total  paupers,  the  figures  are  94  per  cent,  and  91  per 
cent,  respectively,  which  are  so  close  that  they  will  probably  fall 
into  the  same  array.  The  ratio  of  numbers  in  receipt  of  outdoor 
relief  to  the  numbers  in  the  workhouse,  in  every  union,  was 
therefore  tabulated  for  1st  January  in  the  census  years  1871,  1881, 
1891. 
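
The compression of the higher proportions when percentages are used may be seen from a short calculation. This is a sketch only; the function name is an assumption introduced for illustration.

```python
def outdoor_percentage(ratio):
    """Percentage of outdoor paupers on total paupers, given the ratio
    of outdoor to indoor paupers (ratio to 1)."""
    return 100 * ratio / (ratio + 1)


for ratio in (1, 2, 5, 10, 15):
    print(f"{ratio:>2} to 1 -> {outdoor_percentage(ratio):.0f} per cent.")
```

A ratio of 15 to 1 gives 94 per cent. and a ratio of 10 to 1 gives 91 per cent., so that a difference of one-half in the ratio shrinks to some three points in the percentage.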

6. Environment. This is the most difficult factor of all to deal
with. In Mr Booth's work the factors tabulated were (1) persons
per acre; (2) percentage of population living two or more to a
room, i.e. "overcrowding"; (3) rateable value per head (Aged Poor:
Condition). The data relating to overcrowding were first collected
at the census of 1891, and are not available for earlier years.
Some  trial  was  made  of  rateable  value  per  head,  but  with  not 
very  satisfactory  results.  For  any  given  year,  and  for  a  group  of 
unions  of  somewhat  similar  character,  e.g.  rural,  the  rateable  value 
per  head  appears  to  be  highly  (negatively)  correlated  with  the 
pauperism,  but  changes  in  the  two  are  not  very  highly  correlated  : 
probably  the  movements  of  assessments  are  sluggish  and  irregular, 
especially  in  the  case  of  falling  assessments  in  rural  unions,  and 
do  not  correspond  at  all  accurately  with  the  real  changes  in  the 
value  of  agricultural  land.  After  some  consideration,  it  was 
decided  to  use  a  very  simple  index  to  the  changing  fortunes  of  a 
district,  viz.  the  movement  of  the  population  itself.  If  the 
population  of  a  district  is  increasing  at  a  rate  above  the  average, 
this is prima facie evidence that its industries are prospering; if
the  population  is  decreasing,  or  not  increasing  as  fast  as  the 
average,  this  strongly  suggests  that  the  industries  are  suffering 
from  a  temporary  lack  of  prosperity  or  permanent  decay.  The 
population  of  every  union  was  therefore  tabulated  for  the  censuses 
of  1871,  1881,  1891. 

7.  Age  Distribution. — As  already  stated,  the  figures  that  are 
known  clearly  indicate  a  very  rapid  rise  of  the  percentage  relieved 
after  65  years  of  age.  The  percentage  of  the  population  over  65 
years of age was therefore worked out for every union and tabulated
from the same three censuses. This is not, of course,
at all a complete index to the composition of the population as
affecting the rate of pauperism, which is sensibly dependent on
the proportion of the two sexes, and the numbers of children as
well. As the percentage in receipt of relief was, however, 20 per
cent. for those over 65, and only 1 to 2 per cent. for those under that
age, it is evidently a most important index. (A more complete
method might have been used by correcting the observed rate of
pauperism to the basis of a standard population with given numbers
of each age and sex. Cf. below, Chap. XI. pp. 223-25.)

8.  The  changes  in  each  of  the  four  quantities  that  had  been 
tabulated  for  every  union  were  then  measured  by  working  out  the 
ratios  for  the  intercensal  decades  1871-81  and  1881-91,  taking 
the  value  in  the  earlier  year  as  100  in  each  case.  The  percentage 
ratios  so  obtained  were  taken  as  the  four  variables.  Further,  as 
the  conditions  are  and  were  very  different  for  rural  and  for  urban 
unions,  it  seemed  very  desirable  to  separate  the  unions  into  groups 
according to their character. But this cannot be done with any
exactness: the majority of unions are of a mixed character, consisting,
say, of a small town with a considerable extent of the
surrounding country. It might seem best to base the classification
on returns of occupations, e.g. the proportions of the population
engaged in agriculture, but the statistics of occupations are not
given in the census for individual unions. Finally, it was decided
to use a classification by density of population, the grouping used
being: Rural, 0.3 person per acre or less; Mixed, more than
0.3 but not more than 1 person per acre; Urban, more than 1 person
per acre. The metropolitan unions were also treated by themselves.
The limit 0.3 for rural unions was suggested by the
density of those agricultural unions the conditions in which
were investigated by the Labour Commission (the unions of
Table VII., Chap. IX.): the average density of these was 0.25,
and 34 of the 38 were under 0.3. The lower limit of density for
urban  unions — 1  per  acre — was  suggested  by  a  grouping  of  Mr 
Booth's  (group  xiv.) :  of  course  1  person  per  acre  is  not  a  density 
associated  with  an  urban  district  in  the  ordinary  sense  of  the 
term,  but  a  country  district  cannot  reach  this  density  unless  it 
include  a  small  town  or  portion  of  a  town,  i.e.  unless  a  large 
proportion  of  its  inhabitants  live  under  urban  conditions. 
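
The two operations of this section, forming the percentage ratios for a decade and grouping the unions by density, may be sketched as follows. The union shown is hypothetical, as are the function names and the choice of the 1891 population for the density; the separate treatment of the metropolitan unions is not represented.

```python
def percentage_ratio(earlier, later):
    """Value in the later year, taking the value in the earlier year as 100."""
    return 100 * later / earlier


def density_group(persons_per_acre):
    """Grouping by density of population, with the limits given in the text."""
    if persons_per_acre <= 0.3:
        return "Rural"
    if persons_per_acre <= 1:
        return "Mixed"
    return "Urban"


# A hypothetical union: population at the three censuses, and its area.
population = {1871: 20000, 1881: 21000, 1891: 20500}
acres = 60000

ratio_1871_81 = percentage_ratio(population[1871], population[1881])
ratio_1881_91 = percentage_ratio(population[1881], population[1891])
group = density_group(population[1891] / acres)

print(ratio_1871_81, round(ratio_1881_91, 1), group)
```

The same ratio would be formed for each of the four quantities (pauperism, out-relief ratio, population, proportion of aged), and the correlations worked out separately within each density group.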

The  method  by  which  the  relations  between  four  variables  are 
discussed  is  fully  described  in  Chapter  XII.  :  at  the  present  stage 
it  can  only  be  stated  that  the  discussion  is  based  on  the  correlations 
between  all  the  possible  (6)  pairs  that  can  be  formed  from  the  four 
variables. 
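
The six pairs may be listed mechanically. In the sketch below the labels for the four variables are shorthand for the percentage ratios described above and are not the author's notation.

```python
from itertools import combinations

# Shorthand labels for the four percentage-ratio variables.
variables = ["pauperism", "out-relief ratio", "population", "proportion of aged"]

# Every unordered pair of distinct variables needs one correlation coefficient.
pairs = list(combinations(variables, 2))
for first, second in pairs:
    print(f"{first} with {second}")
print(len(pairs), "pairs")
```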

9.  Illustration  ii. — The  subject  of  investigation  is  the  inheritance 
of  fertility  in  man.  (Cf.  Pearson  and  others,  ref.  3.)  One  table, 
from  the  memoir  cited,  was  given  as  an  example  in  the  last  chapter 
(Table  IV.). 

Fertility in man (i.e. the number of children born to a given pair)
is very largely influenced by the age of husband and wife at
marriage (especially the latter), and by the duration of marriage.
It is desired to find whether it is also influenced by the heritable
constitution of the parents, i.e. whether, allowance being made for
the effect of such disturbing causes as age and duration of marriage,
fertility is itself a heritable character.

The effect of duration of marriage may be largely eliminated
by excluding all marriages which have not lasted, say, 15 years
at least. This will rather heavily reduce the number of records
available, but will leave a sufficient number for discussion. It
would be desirable to eliminate the effect of late marriages in
the same way by excluding all cases in which, say, husband was
over 30 years of age or wife over 25 (or even less) at the time
of marriage. But, unfortunately, this is impossible; the age of
the wife, the most important factor, is only exceptionally given
in peerages, family histories, and similar works, from which the
data must be compiled. All marriages must therefore be
included, whatever the age of the parents at marriage, and the
effect of the varying age at marriage must be estimated
afterwards.

10. But the correlation between (1) number of children of a
woman and (2) number of children of her daughter will be further
affected according as we include in the record all her available
daughters or only one. Suppose, e.g., the number of children in
the first generation is 5 (say the mother and her brothers and
sisters), and that she has three daughters with 0, 2, and 4
children respectively: are we to enter all three pairs (5, 0),
(5, 2), (5, 4) in the correlation-table, or only one pair? If the
latter, which pair? For theoretical simplicity the second process
is  distinctly  the  best  (though  it  still  further  limits  the  available 
data).  If  it  be  adopted,  some  regular  rule  will  have  to  be  made 
for  the  selection  of  the  daughter  whose  fertility  shall  be  entered 
in  the  table,  so  as  to  avoid  bias :  the  first  daughter  married 
for  whom  data  are  given,  and  who  fulfils  the  conditions  as  to 
duration  of  marriage,  may,  for  instance,  be  taken  in  every  case. 
(For  a  much  more  detailed  discussion  of  the  problem,  and  the 
allied  problems  regarding  the  inheritance  of  fertility  in  the  horse, 
the  student  is  referred  to  the  original.) 
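
The two ways of entering a family in the correlation-table may be contrasted in a short sketch. The record layout (year of marriage, duration, number of children) and the particular years are hypothetical, chosen only to reproduce the pairs of the example; the limit of 15 years is the one suggested in the preceding section.

```python
MIN_DURATION = 15  # years of marriage required before a record is used


def all_pairs(family_size, daughters):
    """One pair for every daughter whose marriage has lasted long enough."""
    return [(family_size, d["children"])
            for d in daughters if d["duration"] >= MIN_DURATION]


def first_married_pair(family_size, daughters):
    """A single pair: the first daughter married who satisfies the condition."""
    eligible = [d for d in daughters if d["duration"] >= MIN_DURATION]
    if not eligible:
        return None
    first = min(eligible, key=lambda d: d["married"])
    return (family_size, first["children"])


# Hypothetical records for the three daughters of the example.
daughters = [
    {"married": 1855, "duration": 20, "children": 0},
    {"married": 1852, "duration": 25, "children": 2},
    {"married": 1860, "duration": 18, "children": 4},
]

print(all_pairs(5, daughters))
print(first_married_pair(5, daughters))
```

Because the rule refers only to the date of marriage and not to the number of children, the choice of daughter cannot be biased by her fertility.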

11. Illustration iii. The subject for investigation is the
relation between the bulk of a crop (wheat and other cereals,
turnips and other root crops, hay, etc.), and the weather. (Cf.
Hooker, ref. 7.)

Produce-statistics  for  the  more  important  crops  of  Great 
Britain  have  been  issued  by  the  Board  of  Agriculture  since 
1885  :  the  figures  are  based  on  estimates  of  the  yield  furnished 
by  official  local  estimators  all  over  the  country.  Estimates  are 
published  for  separate  counties  and  for  groups  of  counties 
(divisions).  But  the  climatic  conditions  vary  so  much  over  the 
United  Kingdom  that  it  is  better  to  deal  with  a  smaller  area, 
more  homogeneous  from  the  meteorological  standpoint.  On  the 
other  hand,  the  area  should  not  be  too  small ;  it  should  be  large 
enough  to  present  a  representative  variety  of  soil.  The  group 
of  eastern  counties,  consisting  of  Lincoln,  Hunts,  Cambridge, 
Norfolk,  Suffolk,  Essex,  Bedford,  and  Hertford,  was  selected  as 
fulfilling  these  conditions.  The  group  includes  the  county  with 
the  largest  acreage  of  each  of  the  ten  crops  investigated,  with 
the  single  exception  of  permanent  grass. 

12. The produce of a crop is dependent on the weather of
a long preceding period, and it is naturally desired to find the
influence of the weather at all successive stages during this
period, and to determine, for each crop, which period of the
year is of most critical importance as regards weather. It must
be remembered, however, that the times of both sowing and
harvest are themselves very largely dependent on the weather,
and  consequently,  on  an  average  of  many  years,  the  limits  of 
the  critical  period  will  not  be  very  well  defined.  If,  therefore, 
we  correlate  the  produce  of  the  crop  (X)  with  the  characteristics 
of  the  weather  (Y)  during  successive  intervals  of  the  year,  it 
will  be  as  well  not  to  make  these  intervals  too  short.  It  was 
accordingly  decided  to  take  successive  groups  of  8  weeks,  over- 
lapping each  other  by  4  weeks,  i.e.  weeks  1-8,  5-12,  etc. 
Correlation  coefficients  were  thus  obtained  at  4-weeks  intervals, 
but  based  on  8  weeks'  weather. 
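
The overlapping periods may be generated as follows. The sketch assumes, for illustration only, a span of 52 weeks numbered from 1; the function name and defaults are not from the original.

```python
def overlapping_periods(n_weeks, length=8, step=4):
    """Successive periods of `length` weeks, each beginning `step` weeks
    after the last, as (first week, last week) pairs."""
    return [(start, start + length - 1)
            for start in range(1, n_weeks - length + 2, step)]


periods = overlapping_periods(52)
print(periods[:3], "...", periods[-1])
print(len(periods), "periods")
```

Each week after the first four thus enters two consecutive periods, which smooths the succession of coefficients at the cost of making neighbouring coefficients partly dependent on the same weather.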

13. It remains to be decided what characteristics of the weather
are  to  be  taken  into  account.  The  rainfall  is  clearly  one  factor 
of  great  importance,  temperature  is  another,  and  these  two  will 
afford  quite  enough  labour  for  a  first  investigation.  The  weekly 
rainfalls  were  averaged  for  eight  stations  within  the  area,  and 
the  average  taken  as  the  first  characteristic  of  the  weather. 
Temperatures  were  taken  from  the  records  of  the  same  stations. 
The  average  temperatures,  however,  do  not  give  quite  the  sort 
of  information  that  is  required :  at  temperatures  below  a  certain 
limit  (about  42°  Fahr.)  there  is  very  little  growth,  and  the 
growth  increases  in  rapidity  as  the  temperature  rises  above  this 
point  (within  limits).  It  was  therefore  decided  to  utilise  the 
figures  for  "accumulated  temperatures  above  42°  Fahr.,"  i.e. 
the  total  number  of  day-degrees  above  42°  during  each  of  the 
8-weekly  periods,  as  the  second  characteristic  of  the  weather ; 
these "accumulated temperatures," moreover, show much larger
variations  than  mean  temperatures. 
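
As a rough sketch of what is meant by day-degrees above 42° Fahr., the excess of each day's temperature over the base may be summed, days below the base counting as zero. The use of daily mean temperatures and the function name are assumptions made for illustration; the official figures may have been computed by a more elaborate rule.

```python
def accumulated_temperature(daily_means_f, base_f=42.0):
    """Sum the day-degrees by which each daily mean exceeds the base."""
    # Days at or below the base contribute nothing to the total.
    return sum(max(t - base_f, 0.0) for t in daily_means_f)


# Hypothetical daily means (degrees Fahr.), not taken from the records.
sample_days = [40.0, 45.0, 50.0, 42.0]
total = accumulated_temperature(sample_days)
print(total)
```

For an 8-weekly period the same sum would simply be taken over the 56 days of the period.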

The student should refer to the original for the full discussion
as to data. The method of treating the correlations
between  three  variables,  based  on  the  three  possible  correlations 
between  them,  is  described  in  Chapter  XII. 

14.  Problems  of  a  somewhat  special  kind  arise  when  dealing 
with  the  relations  between  simultaneous  values  of  two  variables 
which  have  been  observed  during  a  considerable  period  of  time, 
for  the  more  rapid  movements  will  often  exhibit  a  fairly  close 
consilience,  while  the  slower  changes  show  no  similarity.  The  two 
following  examples  will  serve  as  illustrations  of  two  methods  which 
are  generally  applicable  to  such  cases. 

Illustration  iv. — Fig.  41  exhibits  the  movements  of  (1)  the 
infantile  mortality  (deaths  of  infants  under  1  year  of  age  per  1000 
births  in  the  same  year) ;  (2)  the  general  mortality  (deaths  at  all 
ages  per  1000  living)  in  England  and  Wales  during  the  period 
1838-1904.  A  very  cursory  inspection  of  the  figure  shows  that 
when  the  infantile  mortality  rose  from  one  year  to  the  next 
the general mortality also rose, as a rule; and similarly, when the
infantile mortality fell, the general mortality also fell. There
were,  in  fact,  only  five  or  six  exceptions  to  this  rule  during  the 
whole  period  under  review.  The  correlation  between  the  annual 
values  of  the  two  mortalities  would  nevertheless  not  be  very  high, 
as  the  general  mortality  has  been  falling  more  or  less  steadily  since 
1875  or  thereabouts,  while  the  infantile  mortality  attained  almost 
a record value in 1899. During a long period of time the correlation
between annual values may, indeed, very well vanish, for the
two  mortalities  are  affected  by  causes  which  are  to  a  large  extent 
different  in  the  two  cases.  To  exhibit,  therefore,  the  closeness  of  the 
relation  between  infantile  and  general  mortality,  for  such  causes 
as  show  marked  changes  between  one  year  and  the  next,  it  will  be 
best  to  proceed  by  correlating  the  annual  changes,  and  not  the  annual 
values.  The  work  would  be  arranged  in  the  following  form  (only 
sufficient  years  being  given  to  exhibit  the  principle  of  the  process), 
and  the  correlation  worked  out  between  the  figures  of  cols.  3  and  5. 


  1.           2.                3.                4.                5.
 Year.     Infantile        Increase or         General         Increase or
         Mortality per     Decrease from     Mortality per     Decrease from
          1000 Births.      Year before.      1000 living.      Year before.

 1838         159                                22.4
 1839         151               -8               21.8              -0.6
 1840         154               +3               22.9              +1.1
 1841         145               -9               21.6              -1.3
 1842         152               +7               21.7              +0.1
 1843         150               -2               21.2              -0.5

For  the  period  to  which  the  diagram  refers,  viz.  1838-1904,  the 
following  constants  were  found  by  this  method  : — 

Infantile mortality, mean annual change      -0.21
                     standard deviation       9.63

General mortality,   mean annual change      -0.09
                     standard deviation       1.14

Coefficient of correlation                   +0.77.
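
The process of correlating annual changes may be sketched as below, using only the six years printed in the specimen table. The helper names are ours. Since only five differences are available, the coefficient obtained is a mere illustration of the arithmetic and is not the value found for the whole period 1838-1904.

```python
import math


def first_differences(values):
    """Increase or decrease of each value from the one before."""
    return [b - a for a, b in zip(values, values[1:])]


def pearson_r(xs, ys):
    """Product-sum correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Columns 2 and 4 of the specimen table, 1838-1843.
infantile = [159, 151, 154, 145, 152, 150]
general = [22.4, 21.8, 22.9, 21.6, 21.7, 21.2]

d_infantile = first_differences(infantile)                    # column 3
d_general = [round(d, 1) for d in first_differences(general)]  # column 5
r_sample = pearson_r(d_infantile, d_general)
print(d_infantile, d_general, round(r_sample, 2))
```

Second or higher differences would be obtained by applying `first_differences` again to its own output before correlating.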

This  is  a  much  higher  correlation  than  would  arise  from  the 
mere  fact  that  the  deaths  of  infants  form  part  of  the  general 
mortality,  and  consequently  there  must  be  a  high  correlation 
between  the  annual  changes  in  the  mortality  of  those  who  are  over 
and under 1 year of age. (Cf. Exercises 7 and 8, Chap. XI.)

This  method,  which  appears  to  have  been  first  used  by  Miss 
Cave  and  by  Mr  Hooker  independently  in  the  papers  cited  in 
refs.  4  and  6,  has  recently  been  generalised  by  "Student"  and 
the theory fully developed by O. Anderson (cf. refs. 13, 14, 15).
By taking the first differences the influence of the slower changes
of the two variables with time may not be wholly eliminated,
but this elimination may be more completely effected by proceeding
to the second differences, i.e. by working out the successive
differences of the differences in col. 3 and in col. 5 before
correlating. It may even be desirable to proceed to third, fourth or
higher differences before correlating.

Fig. 41. Infantile and General Mortality in England and Wales, 1838-1904.

Fig. 42. Marriage-rate and Foreign Trade, England and Wales, 1855-1904.


15.  Illustration  v. — The  two  curves  of  fig.  42  show  (1)  the 
marriage-rate (persons married per 1000 of the population) for
England and Wales; (2) the values of exports and imports per
head of the population of the United Kingdom for every year
from 1855 to 1904. Inspection of the diagram suggests a similar
relation to that of the last example, the one variable showing a
rise from one year to the next when the other rises, and a fall
when the other falls. The movement of both variables is, however,
of a much more regular kind than that of mortality,
resembling  a  series  of  "  waves  "  superposed  on  a  steady  general 
trend,  and  it  is  the  "  waves  "  in  the  two  variables — the  short-period 
movements,  not  the  slower  trends — which  are  so  clearly  related. 

16.  It  is  not  difficult,  moreover,  to  separate  the  short-period 
oscillations,  more  or  less  approximately,  from  the  slower  movement. 
Suppose  the  marriage-rate  for  each  year  replaced  by  the  average 
of  an  odd  number  of  years  of  which  it  is  the  centre,  the  number 
being  as  near  as  may  be  the  same  as  the  period  of  the  "  waves  " — 
e.g.  nine  years.  If  these  short-period  averages  were  plotted  on 
the  diagram  instead  of  the  rates  of  the  individual  years,  we  should 
evidently  obtain  a  smoother  curve  which  would  clearly  exhibit 
the  trend  and  be  practically  free  from  the  conspicuous  waves. 
The  excess  or  defect  of  each  annual  rate  above  or  below  the 
trend,  if  plotted  separately,  would  therefore  give  the  "waves" 
apart  from  the  slower  changes.  The  figures  for  foreign  trade 
may  be  treated  in  the  same  way  as  the  marriage-rate,  and  we 
can  accordingly  work  out  the  correlation  between  the  waves  or 
rapid  fluctuations,  undisturbed  by  the  movements  of  longer  period, 
however  great  they  may  be.  The  arithmetic  may  be  carried  out 
in  the  form  of  the  following  table,  and  the  correlation  worked  out 
in  the  ordinary  way  between  the  figures  of  columns  4  and  7. 


  1.         2.            3.         4.            5.             6.        7.
 Year.  Marriage-rate     Nine      Differ-   Exports+Imports,    Nine     Differ-
          (England       Years'      ence.      £'s per head     Years'     ence.
         and Wales).    Average.                   (U.K.).      Average.

 1855       16.2                                    9.36
 1856       16.7                                   11.14
 1857       16.5                                   11.85
 1858       16.0                                   10.73
 1859       17.0          16.5       +0.5          11.72         12.15     -0.43
 1860       17.1          16.6       +0.5          13.03         12.94     +0.09
 1861       16.3          16.7       -0.4          13.01         13.52     -0.51
 1862       16.1          16.8       -0.7          13.40         14.17     -0.77
 1863       16.8          16.9       -0.1          15.13         14.81     +0.32
 1864       17.2                                   16.43
 1865       17.5                                   16.37
 1866       17.5                                   17.72
 1867       16.5                                   16.47
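
The arithmetic of the table may be sketched as follows, the data being columns 2 and 5 as printed above. The function names are ours. Years for which a full nine-year group is not available are left without an average, as in the table.

```python
def centred_moving_average(values, window=9):
    """Average of `window` consecutive values, placed at the centre year."""
    half = window // 2
    out = [None] * len(values)
    for i in range(half, len(values) - half):
        out[i] = sum(values[i - half:i + half + 1]) / window
    return out


def deviations_from_trend(values, window=9):
    """Excess or defect of each value above or below its moving average."""
    trend = centred_moving_average(values, window)
    return [None if t is None else v - t for v, t in zip(values, trend)]


years = list(range(1855, 1868))
marriage = [16.2, 16.7, 16.5, 16.0, 17.0, 17.1, 16.3,
            16.1, 16.8, 17.2, 17.5, 17.5, 16.5]
trade = [9.36, 11.14, 11.85, 10.73, 11.72, 13.03, 13.01,
         13.40, 15.13, 16.43, 16.37, 17.72, 16.47]

marriage_trend = centred_moving_average(marriage)   # column 3
trade_trend = centred_moving_average(trade)         # column 6
marriage_dev = deviations_from_trend(marriage)      # column 4
trade_dev = deviations_from_trend(trade)            # column 7

for year, t, d in zip(years, trade_trend, trade_dev):
    if t is not None:
        print(year, round(t, 2), round(d, 2))
```

The correlation would then be worked out between `marriage_dev` and `trade_dev` for the years in which both are defined.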

17.  Fig.  43  is  drawn  from  the  figures  of  columns  4  and  7,  and 
shows  very  well  how  closely  the  oscillations  of  the  marriage-rate 
are  related  to  those  of  trade.  For  the  period  1861-95  the 
correlation between the two oscillations (Hooker, ref. 5) is 0.86.
The method may obviously be extended by correlating the deviation
of the marriage-rate in any one year with the deviation of
the  exports  and  imports  of  the  year  before,  or  two  years  before, 
instead  of  the  same  year ;  if  a  sufficient  number  of  years  be 
taken,  an  estimate  may  be  made,  by  interpolation,  of  the  time- 
difference  that  would  make  the  correlation  a  maximum  if  it  were 
possible  to  obtain  the  figures  for  exports  and  imports  for  periods 
other  than  calendar  years.  Thus  Mr  Hooker  finds  (ref.  5)  that 
on  an  average  of  the  years  1861-95  the  correlation  would  be  a 
maximum  between  the  marriage-rate  and  the  foreign  trade  of 
about  one-third  of  a  year  earlier.  The  method  is  an  extremely 
useful one, and is obviously applicable to any similar case. The
student should refer to the paper by Mr Hooker, cited. Reference
may also be made to ref. 10, in which several diagrams are given
similar to fig. 43, and the nature of the relationship between the
marriage-rate and such factors as trade, unemployment, etc., is
discussed, it being suggested that the relation is even more
complex than appears from the above. The same method of
separating the short-period oscillations was used at an earlier
date by Poynting in ref. 16, to which the student is referred
for a discussion of the method.

Fig. 43. Fluctuations in (1) Marriage-rate and (2) Foreign Trade (Exports
+ Imports per head) in England and Wales: the Curves show Deviations
from 9-year means. Data of R. H. Hooker, Jour. Roy. Stat. Soc., 1901.
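
The extension to time-differences may be sketched as below. The series are hypothetical and constructed so that one follows the other by exactly one year. The parabolic interpolation for the position of the maximum is one simple choice assumed here for illustration; the source does not state which form of interpolation Mr Hooker used.

```python
import math


def pearson_r(xs, ys):
    """Product-sum correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def lagged_r(series, leader, lag):
    """Correlate series[t] with leader[t - lag], for a non-negative lag."""
    if lag == 0:
        return pearson_r(series, leader)
    return pearson_r(series[lag:], leader[:-lag])


def parabolic_peak(r_minus, r_zero, r_plus):
    """Offset of the vertex of the parabola through three equally spaced
    correlations, measured from the middle one in units of the spacing."""
    denom = r_minus - 2.0 * r_zero + r_plus
    return 0.5 * (r_minus - r_plus) / denom


# Hypothetical deviations: "marriage" repeats "trade" one year later.
trade = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 8.0, 7.0, 9.0]
marriage = [0.0] + trade[:-1]

r_lag0 = lagged_r(marriage, trade, 0)
r_lag1 = lagged_r(marriage, trade, 1)
print(round(r_lag0, 3), round(r_lag1, 3))
```

With real data the coefficients at lags of 0, 1 and 2 years would be passed to `parabolic_peak` to estimate a fractional time-difference.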

18.  It  was  briefly  mentioned  in  §  9  of  the  last  chapter  that 
the  treatment  of  cases  when  the  regression  was  non-linear  was, 
in  general,  somewhat  difficult.  Such  cases  lie  strictly  outside 
the  scope  of  the  present  volume,  but  it  may  be  pointed  out 
that if a relation between X and Y be suggested, either by
theory or by previous experience, it may be possible to throw
that relation into the form

$$Y = A + B\,\phi(X),$$

where A and B are the only unknown constants to be determined.
If a correlation-table be then drawn up between Y and $\phi(X)$
instead of Y and X, the regression will be approximately linear.
Thus  in  Table  V.  of  the  last  chapter,  if  X  be  the  rate  of 
discount  and  Y  the  percentage  of  reserves  on  deposits,  a 
diagram  of  the  curves  of  regression,  or  curves  on  which  the 
means  of  arrays  lie,  suggests  that  the  relation  between  X  and  Y 
is  approximately  of  the  form 

$$X(Y - B) = A,$$

A and B being constants; that is,

$$XY = A + BX.$$

Or, if we make XY a new variable, say Z,

$$Z = A + BX.$$

Hence,  if  we  draw  up  a  new  correlation-table  between  X  and  Z 
the  regression  will  probably  be  much  more  closely  linear. 
If  the  relation  between  the  variables  be  of  the  form 

$$Y = AB^{X}$$

we have

$$\log Y = \log A + X \log B,$$

and hence the relation between log Y and X is linear. Similarly,
if the relation be of the form

$$X^{n}Y = A$$

we have

$$\log Y = \log A - n \log X,$$

and  so  the  relation  between  log  Y  and  log  X  is  linear.  By 
means  of  such  artifices  for  obtaining  correlation-tables  in 
which  the  regression  is  linear,  it  may  be  possible  to  do  a  good 
deal  in  difficult  cases  whilst  using  elementary  methods  only. 
The  advanced  student  should  refer  to  ref.  17  for  a  different 
method  of  treatment. 
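
The two artifices of this section may be illustrated by a short sketch. The data are hypothetical and generated exactly from the assumed relations, so that the straight line fitted to the transformed variables recovers the constants. The function name `fit_line` is ours.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope


# Case 1: X(Y - B) = A, so that Z = XY = A + BX is linear in X.
A, B = 3.0, 10.0
xs = [1.0, 2.0, 4.0, 5.0]
ys = [B + A / x for x in xs]
zs = [x * y for x, y in zip(xs, ys)]
a_hat, b_hat = fit_line(xs, zs)

# Case 2: Y = A * B**X, so that log Y = log A + X log B is linear in X.
A2, B2 = 2.0, 1.5
xs2 = [0.0, 1.0, 2.0, 3.0, 4.0]
ys2 = [A2 * B2 ** x for x in xs2]
log_a, log_b = fit_line(xs2, [math.log(y) for y in ys2])
a2_hat, b2_hat = math.exp(log_a), math.exp(log_b)

print(round(a_hat, 6), round(b_hat, 6), round(a2_hat, 6), round(b2_hat, 6))
```

With observed data the fit would of course be only approximate, and the correlation-table would be drawn up between the transformed variables.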

19.  The  only  strict  method  of  calculating  the  correlation 
coefficient is that described in Chapter IX. from the formula
$r = \dfrac{\Sigma(xy)}{N\sigma_x\sigma_y}$. Approximations to this value may, however, be
found in various ways, for the most part dependent either (1)
on the formulae for the two regressions $r\dfrac{\sigma_x}{\sigma_y}$ and $r\dfrac{\sigma_y}{\sigma_x}$, or (2) on
the formulae for the standard deviations of the arrays $\sigma_x\sqrt{1-r^2}$
and $\sigma_y\sqrt{1-r^2}$. Such approximate methods are not recommended
for ordinary use, as they will lead to different results in different
hands, but a few may be given here, as being occasionally useful
for estimating the value of the correlation in cases where the
data are not given in such a shape as to permit of the proper
calculation of the coefficient.

(1)  The  means  of  rows  and  columns  are  plotted  on  a  diagram, 
and  lines  fitted  to  the  points  by  eye,  say  by  shifting  about 
a  stretched  black  thread  until  it  seems  to  run  as  near  as  may 
be to all the points. If $b_1$, $b_2$ be the slopes of these two lines
to the vertical and the horizontal respectively,

$$r = \sqrt{b_1 b_2}.$$

Hence  the  value  of  r  may  be  estimated  from  any  such  diagram 
as  figs.  36-40  in  Chapter  IX.,  in  the  absence  of  the  original 
table.  Further,  if  a  correlation-table  be  not  grouped  by 
equal  intervals,  it  may  be  difficult  to  calculate  the  product 
sum,  but  it  may  still  be  possible  to  plot  approximately  a  diagram 
of  the  two  lines  of  regression,  and  so  determine  roughly  the 
value  of  r.  Similarly,  if  only  the  means  of  two  rows  and 
two  columns,  or  of  one  row  and  one  column  in  addition  to  the 
means  of  the  two  variables,  are  known,  it  will  still  be  possible 
to  estimate  the  slopes  of  RR  and  CC,  and  hence  the  correlation 
coefficient. 
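
That the product of the two slopes gives the square of r may be checked numerically. In the sketch below the slopes are calculated by least squares in place of the stretched thread, and the data are hypothetical; the sign of r is taken from the sign of the slopes.

```python
import math


def slopes_and_r(xs, ys):
    """Return (b1, b2, r): slope of x on y, slope of y on x, and r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    b1 = sxy / syy   # regression of x on y: slope to the vertical
    b2 = sxy / sxx   # regression of y on x: slope to the horizontal
    return b1, b2, sxy / math.sqrt(sxx * syy)


# Hypothetical pairs of observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

b1, b2, r = slopes_and_r(xs, ys)
r_from_slopes = math.copysign(math.sqrt(b1 * b2), b1)
print(round(r, 4), round(r_from_slopes, 4))
```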

(2)  The  means  of  one  set  of  arrays  only,  say  the  rows,  are 
calculated, and also the two standard-deviations $\sigma_x$ and $\sigma_y$. The
means are then plotted on a diagram, using the standard-deviation
of each variable as the unit of measurement, and a line fitted by
eye. The slope of this line to the vertical is r. If the standard
deviations be not used as the units of measurement in plotting,
the slope of the line to the vertical is $r\sigma_x/\sigma_y$, and hence r will be
obtained by dividing the slope by the ratio of the standard-
deviations.

This  method,  or  some  variation  of  it,  is  often  useful  as  a 
makeshift  when  the  data  are  too  incomplete  to  permit  of  the 
proper  calculation  of  the  correlation,  only  one  line  of  regression 
and  the  ratio  of  the  dispersions  of  the  two  variables  being  required  : 
the  ratio  of  the  quartile  deviations,  or  other  simple  measures  of 
dispersion,  will  serve  quite  well  for  rough  purposes  in  lieu  of  the 
ratio of standard-deviations. As a special case, we may note that
if the two dispersions are approximately the same, the slope of
RR to the vertical is r.

Plotting the medians of arrays on a diagram with the quartile
deviations as units, and measuring the slope of the line, was the
method of determining the correlation coefficient ("Galton's
function") used by Sir Francis Galton, to whom the introduction
of such a coefficient is due. (Refs. 2-4 of Chap. IX. p. 188.)

(3) If $s_x$ be the standard-deviation of errors of estimate like
$x - b_1 y$, we have from Chap. IX. § 11

$$s_x^2 = \sigma_x^2(1 - r^2),$$

and hence

$$r = \sqrt{1 - \frac{s_x^2}{\sigma_x^2}}.$$

But if the dispersions of arrays do not differ largely, and the
regression is nearly linear, the value of $s_x$ may be estimated from
the average of the standard-deviations of a few rows, and r determined
(or rather estimated) accordingly. Thus in Table III.,
Chap. IX., the standard-deviations of the ten columns headed
62.5-63.5, 63.5-64.5, etc., are

        2.56        2.26
        2.11        2.26
        2.55        2.45
        2.24        2.33
        2.23
        2.60        Mean 2.359

The standard-deviation of the stature of all sons is 2.75: hence
approximately

$$r = \sqrt{1 - \left(\frac{2.359}{2.75}\right)^2} = 0.514.$$

This is the same as the value found by the product-sum method
to the second decimal place. It would be better to take an
average by counting the square of each standard-deviation
once for each observation in the column (or "weighting"
it with the number of observations in the column), but in the
present case this would only lead to a very slightly different
result, viz. s = 2.362, r = 0.512.
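
The estimate just made may be reproduced in a few lines. The ten standard-deviations and the value 2.75 are those quoted above; the variable names are ours. The weighted form is not attempted here, since the frequencies of the ten columns are not restated in this passage.

```python
import math

# Standard-deviations of the ten columns, as listed in the text.
column_sds = [2.56, 2.11, 2.55, 2.24, 2.23, 2.60, 2.26, 2.26, 2.45, 2.33]
sigma_sons = 2.75   # standard-deviation of the stature of all sons

# Simple (unweighted) average of the array standard-deviations.
s_mean = sum(column_sds) / len(column_sds)

# r estimated from s^2 = sigma^2 (1 - r^2).
r_est = math.sqrt(1.0 - (s_mean / sigma_sons) ** 2)
print(round(s_mean, 3), round(r_est, 3))
```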

20. The Correlation Ratio. The method clearly would not
give an approximation to the correlation coefficient, however, in
the case of such tables as V. and VI. of Chap. IX., in which the
means of successive arrays do not lie closely round straight lines.
In such cases it would always tend to give a value for r markedly
higher than that given by the product-sum method. The
product-sum method gives in fact a value based on the standard-
deviation round the line of regression; the method used above
gives a value dependent on the standard-deviation round a line
which sweeps through all the means of arrays, and the second
standard-deviation is necessarily less than the first. We reach,
therefore, a generalised coefficient which measures the approach
towards a curvilinear line of regression of any form.

Let $s_{ax}$ denote the standard-deviation of any array of X's, and
let n, as before, be the number of observations in this array (Chap.
IX., § 11), and further let

$$\sigma_{ax}^2 = \frac{\Sigma(n\,s_{ax}^2)}{N} \qquad (1)$$

Then $\sigma_{ax}$ is an average of the standard-deviations of the arrays
obtained as suggested at the end of the last section. Now let

$$\sigma_{ax}^2 = \sigma_x^2(1 - \eta_{xy}^2) \qquad (2)$$

or

$$\eta_{xy}^2 = 1 - \frac{\sigma_{ax}^2}{\sigma_x^2} \qquad (3)$$

Then $\eta_{xy}$ is termed by Professor Pearson a correlation-ratio (ref.
18). As there are clearly two correlation-ratios for any one table,
it should be distinguished as the correlation-ratio of X on Y: it
measures the approach of values of X associated with given
values of Y to a single-valued relationship of any form. The
calculation would be exceedingly laborious if we had actually to
evaluate $\sigma_{ax}$, but this may be avoided and the work greatly
simplified by the following consideration. If $M_x$ denote the mean
of all X's, $m_x$ the mean of an array, then we have by the general
relation given in § 11 of Chap. VIII. (p. 142)

$$\sigma_x^2 = \sigma_{ax}^2 + \frac{\Sigma\,n(m_x - M_x)^2}{N}.$$

Or, using $\sigma_{mx}$ to denote the standard-deviation of $m_x$,

$$\sigma_x^2 = \sigma_{ax}^2 + \sigma_{mx}^2 \qquad (4)$$

Hence, substituting in (3)

$$\eta_{xy} = \frac{\sigma_{mx}}{\sigma_x} \qquad (5)$$

The correlation-ratio of X on Y is therefore determined when we
have found, in addition to the standard-deviation of X, the
standard-deviation of the means of its arrays.

21. The correlation-ratio of X on Y cannot be less than the
correlation-coefficient for X and Y, and $\eta_{xy}^2 - r^2$ is a measure of
the divergence of the regression of X on Y from linearity. For
if d denote, as in Chap. IX., the deviation of the mean of an
array of X's from the line of regression, we have by the relation
of Chap. IX., § 11, p. 172

$$\sigma_x^2(1 - r^2) = \sigma_{ax}^2 + \sigma_d^2 \qquad (6)$$

Substituting for $\sigma_{ax}$ from (2), that is,

$$\sigma_d^2 = \sigma_x^2(\eta_{xy}^2 - r^2) \qquad (7)$$

But $\sigma_d^2$ is necessarily positive, and therefore $\eta_{xy}$ is not less than r.
The magnitude of $\sigma_d$, and therefore of $\eta_{xy}^2 - r^2$, measures the
divergence of the actual line through the means of arrays from
the line of regression.
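
The relations of the last two sections may be verified on a small hypothetical table in which the means of arrays clearly do not lie on a straight line. The data and names below are ours and serve only to exhibit the arithmetic: the standard-deviation of X is split into its within-array and between-array parts, and the square of the correlation-ratio is compared with the square of r.

```python
# Hypothetical arrays: each key is a value of Y, each list the X's found with it.
arrays = {1.0: [1.0, 2.0, 3.0], 2.0: [4.0, 5.0, 6.0], 3.0: [2.0, 3.0, 4.0]}

xs = [x for values in arrays.values() for x in values]
ys = [y for y, values in arrays.items() for _ in values]
n_total = len(xs)
mean_x = sum(xs) / n_total
mean_y = sum(ys) / n_total
var_x = sum((x - mean_x) ** 2 for x in xs) / n_total
var_y = sum((y - mean_y) ** 2 for y in ys) / n_total

# Weighted mean square deviation within arrays, and variance of array means.
var_within = sum(
    sum((x - sum(v) / len(v)) ** 2 for x in v) for v in arrays.values()
) / n_total
var_means = sum(
    len(v) * (sum(v) / len(v) - mean_x) ** 2 for v in arrays.values()
) / n_total

eta_sq = var_means / var_x
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n_total
r_sq = cov ** 2 / (var_x * var_y)
var_d = var_x * (eta_sq - r_sq)   # divergence from the line of regression

print(round(eta_sq, 3), round(r_sq, 3), round(var_d, 3))
```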

It  should  be  noted  that,  owing  to  the  fluctuations  of  sampling, 
r and $\eta$ are almost certain to differ slightly, even though the
regression may be truly linear. The observed value of $\eta^2 - r^2$
must be compared with the values that may arise owing to
fluctuations of sampling alone, before a definite significance can
be ascribed to it (cf. Pearson, ref. 19, Blakeman, ref. 22, and the
formulae  cited  therefrom  on  p.  352  below). 

22.  The  following  table  illustrates  the  form  of  the  arithmetic 
for  the  calculation  of  the  correlation-ratio  of  son's  stature  on 
father's  stature  (Table  III.  of  Chap.  IX.,  p.  160).  In  the  first 
column  is  given  the  type  of  the  array  (stature  of  father) ;  in  the 
second,  the  mean  stature  of  sons  for  that  array ;  in  the  third,  the 
difference of the mean of the array from the mean stature of all
sons. In the fourth column these differences are squared, and in
the sixth the squares are multiplied by the frequency of the array, two
decimal places only having been retained as sufficient for the
present purpose. The sum total of the last column divided by
the number of observations (1078) gives $\sigma_{my}^2 = 2.058$, or $\sigma_{my} = 1.43$.
As the standard-deviation of the sons' stature is 2.75 in. (cf.
Chap. IX., question 3), $\eta_{yx} = 0.52$. Before taking the differences
for the third column of such a table, it is as well to check the
means of the arrays by recalculating from them the mean of the
whole distribution, i.e. multiplying each array-mean by its frequency,
summing, and dividing by the number of observations.
The form of the arithmetic may be varied, if desired, by working
from zero as origin, instead of taking differences from the true
mean. The square of the mean must then be subtracted from
$\Sigma(f\,m_y^2)/N$ to give $\sigma_{my}^2$.
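
The arithmetic described above may be sketched as follows, using the array-means and frequencies of the worked example for son's stature on father's stature. The variable names are ours. The check on the mean of the whole distribution, recommended in the text, is included.

```python
import math

# (mean stature of sons in the array, frequency of the array),
# for fathers' statures 59, 60, ..., 75 in.
arrays = [
    (64.67, 3), (65.64, 3.5), (66.34, 8), (65.56, 17), (66.68, 33.5),
    (66.74, 61.5), (67.19, 95.5), (67.61, 142), (67.95, 137.5),
    (69.07, 154), (69.39, 141.5), (69.74, 116), (70.50, 78),
    (70.87, 49), (72.00, 28.5), (71.50, 4), (71.73, 5.5),
]
mean_all_sons = 68.66
sigma_sons = 2.75

n_total = sum(f for _, f in arrays)

# Check: the array-means, weighted by frequency, should give back the mean.
recomputed_mean = sum(f * m for m, f in arrays) / n_total

# Square of the standard-deviation of the means of arrays.
var_of_means = sum(f * (m - mean_all_sons) ** 2 for m, f in arrays) / n_total
sigma_means = math.sqrt(var_of_means)
eta = sigma_means / sigma_sons

print(n_total, round(recomputed_mean, 2), round(var_of_means, 3),
      round(sigma_means, 2), round(eta, 2))
```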

If  the  second  correlation-ratio  for  this  table  be  worked  out  in 
the  same  way,  the  value  will  be  found  to  be  the  same  to  the 
second  place  of  decimals  :  the  two  correlation-ratios  for  this  table 
are,  therefore,  very  nearly  identical,  and  only  slightly  greater 
than the correlation-coefficient (0.51). Both regressions, it
follows from the last section, are very nearly linear, a result
confirmed by the diagram of the regression lines (fig. 37, p. 174).
On the other hand, it is evident from fig. 39, p. 176, that we
should expect the two correlation-ratios for Table VI. of the same
chapter to differ considerably from each other and from the correlation.
The values found are $\eta_{xy} = 0.14$, $\eta_{yx} = 0.38$ ($r = -0.014$):
$\eta_{xy}$ is comparatively low as proportions of male births differ little
in the successive arrays, but $\eta_{yx}$ is higher since the line of regression
of Y on X is sharply curved. For Table VIII., p. 183, the
two ratios are $\eta_{xy} = 0.46$, $\eta_{yx} = 0.39$ ($r = 0.34$). The confirmation
of  these  values  is  left  to  the  student. 

The  student  should  notice  that  the  correlation-ratio  only 
affords  a  satisfactory  test  when  the  number  of  observations  is 
sufficiently large for a grouped correlation table to be formed.
In  the  case  of  a  short  series  of  observations  such  as  that  given  in 
Table  VII.,  p.  178,  the  method  is  inapplicable. 


Calculation of the Correlation-Ratio: Example. Son's Stature on
Father's Stature: Data of Table III., Chap. IX., p. 160.

    1.           2.            3.             4.            5.            6.
 Type of      Mean of      Difference      Square of    Frequency.   Frequency x
  Array        Array       from Mean      Difference.               (difference)^2.
(Father's     (Son's      of all Sons
 Stature).   Stature).     (68.66).

    59         64.67         -3.99         15.9201          3            47.76
    60         65.64         -3.02          9.1204          3.5          31.92
    61         66.34         -2.32          5.3824          8            43.06
    62         65.56         -3.10          9.6100         17           163.37
    63         66.68         -1.98          3.9204         33.5         131.33
    64         66.74         -1.92          3.6864         61.5         226.71
    65         67.19         -1.47          2.1609         95.5         206.37
    66         67.61         -1.05          1.1025        142           156.56
    67         67.95         -0.71          0.5041        137.5          69.31
    68         69.07         +0.41          0.1681        154            25.89
    69         69.39         +0.73          0.5329        141.5          75.41
    70         69.74         +1.08          1.1664        116           135.30
    71         70.50         +1.84          3.3856         78           264.08
    72         70.87         +2.21          4.8841         49           239.32
    73         72.00         +3.34         11.1556         28.5         317.93
    74         71.50         +2.84          8.0656          4            32.26
    75         71.73         +3.07          9.4249          5.5          51.84

  Total                                                  1078          2218.42

$\sigma_{my}^2 = 2218.42/1078 = 2.058$, $\sigma_{my} = 1.43$,
$\eta_{yx} = 1.43/2.75 = 0.52$.

MISCELLANEOUS THEOREMS INVOLVING THE USE OF
THE CORRELATION-COEFFICIENT.

1. Introductory; 2. Standard-deviation of a sum or difference; 3-5. Influence
of errors of observation and of grouping on the standard-
deviation; 6-7. Influence of errors of observation on the correlation-
coefficient (Spearman's theorems); 8. Mean and standard-deviation
of an index; 9. Correlation between indices; 10. Correlation-
coefficient for a two- x two-fold table; 11. Correlation-coefficient
for all possible pairs of N values of a variable; 12. Correlation due
to heterogeneity of material; 13. Reduction of correlation due to
mingling of uncorrelated with correlated material; 14-17. The
weighted mean; 18-19. Application of weighting to the correction
of death-rates, etc., for varying sex and age-distributions; 20. The
weighting of forms of average other than the arithmetic mean.

1.  It  has  already  been  pointed  out  that  a  statistical  measure,  if 
it  is  to  be  widely  useful,  should  lend  itself  readily  to  algebraical 
treatment.  The  arithmetic  mean  and  the  standard-deviation 
derive  their  importance  largely  from  the  fact  that  they  fulfil  this 
requirement  better  than  any  other  averages  or  measures  of  dis- 
persion ;  and  the  following  illustrations,  while  giving  a  number  of 
results  that  are  of  value  in  one  branch  or  another  of  statistical 
work,  suffice  to  show  that  the  correlation-coefficient  can  be  treated 
with  the  same  facility.  This  might  indeed  be  expected,  seeing 
that the coefficient is derived, like the mean and standard-deviation,
by a straightforward process of summation.

2. To find the Standard-deviation of the sum or difference Z of
corresponding values of two variables $X_1$ and $X_2$.

Let z, $x_1$, $x_2$ denote deviations of the several variables from
their arithmetic means. Then if

$$Z = X_1 \pm X_2,$$

evidently

$$z = x_1 \pm x_2.$$

Squaring both sides of the equation and summing,

$$\Sigma(z^2) = \Sigma(x_1^2) + \Sigma(x_2^2) \pm 2\Sigma(x_1 x_2).$$

That is, if r be the correlation between $x_1$ and $x_2$, and $\sigma$, $\sigma_1$, $\sigma_2$
the respective standard-deviations,

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 \pm 2r\,\sigma_1\sigma_2 \qquad (1)$$

If $x_1$ and $x_2$ are uncorrelated, we have the important special case

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 \qquad (2)$$

The student should notice that in this case the standard-
deviation of the sum of corresponding values of the two variables
is the same as the standard-deviation of their difference.

The same process will evidently give the standard-deviation of a
linear function of any number of variables. For the sum of a
series of variables $X_1, X_2, \ldots, X_n$ we must have

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 + \ldots + \sigma_n^2 + 2r_{12}\,\sigma_1\sigma_2 + 2r_{13}\,\sigma_1\sigma_3 + \ldots + 2r_{23}\,\sigma_2\sigma_3 + \ldots$$

$r_{12}$ being the correlation between $X_1$ and $X_2$, $r_{13}$ the correlation
between $X_1$ and $X_3$, and so on.
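
Equation (1) may be checked numerically. In the following sketch the two series are hypothetical; the standard-deviation of the sum and of the difference is found directly, and again from the formula.

```python
import math


def sd(values):
    """Standard-deviation about the arithmetic mean (divisor N)."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((v - m) ** 2 for v in values) / n)


def pearson_r(xs, ys):
    """Product-sum correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    p = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return p / (sd(xs) * sd(ys))


# Hypothetical corresponding values of two variables.
x1 = [2.0, 4.0, 4.0, 7.0, 8.0]
x2 = [1.0, 3.0, 2.0, 6.0, 3.0]

s1, s2, r = sd(x1), sd(x2), pearson_r(x1, x2)

sd_sum_direct = sd([a + b for a, b in zip(x1, x2)])
sd_diff_direct = sd([a - b for a, b in zip(x1, x2)])
sd_sum_formula = math.sqrt(s1 ** 2 + s2 ** 2 + 2 * r * s1 * s2)
sd_diff_formula = math.sqrt(s1 ** 2 + s2 ** 2 - 2 * r * s1 * s2)

print(round(sd_sum_direct, 4), round(sd_sum_formula, 4))
print(round(sd_diff_direct, 4), round(sd_diff_formula, 4))
```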

3. Influence of Errors of Observation on the Standard-deviation.
The results of § 2 may be applied to the theory of errors of
observation. Let us suppose that, if any value of X be observed
a large number of times, the arithmetic mean of the observations
is approximately the true value, the arithmetic mean error being
zero. Then, the arithmetic mean error being zero for all values
of X, the error, say $\delta$, is uncorrelated with X. In this case if $x_1$ be
an observed deviation from the arithmetic mean, x the true deviation,
we have from the preceding

$$\sigma_{x_1}^2 = \sigma_x^2 + \sigma_\delta^2 \qquad (3)$$

The effect of errors of observation is, consequently, to increase the
standard-deviation above its true value. The student should
notice that the assumption made does not imply the complete
independence of X and $\delta$: he is quite at liberty to suppose that
errors fluctuate more, for example, with large than with small
values of X, as might very probably happen. In that case the
contingency-coefficient between X and $\delta$ would not be zero,
although the correlation-coefficient might still vanish as supposed.

4.  Influence  of  Grouping  on  iht  Standard-deviation. — The 
consequence  of  grouping  observations  to  form  the  frequency 
distribution  is  to  introduce  errors  that  are,  in  effect,  errors  of  ' 
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measurement.  Instead  of  assigning  to  any  observation  its  true 
value  X,  we  assign  to  it  the  value  Xj  corresponding  to  the  centre 
of  the  class-interval,  thereby  making  an  error  8,  where 

To  deduce  from  this  equation  a  formula  showing  the  nature  of 
the  influence  of  grouping  on  the  standard-deviation  we  must  know 
the  correlation  between  the  error  8  and  X  or  Xy  If  the  original 
distribution  were  a  histogram,  JTj  and  8  would  be  uncorrelated, 
the  mean  value  of  8  being  zero  for  every  value  of  X^ :  further,  the 
square  of  the  standard-deviatiofi  of  8  would  be  c2/12,  where  c  is 
the  class-interval  (Chap.  VIII.  §  12,  eqn.  (10)).  Hence,  if  o-^  be  the 
standard-deviation  of  the  grouped  values  X^  and  a  the  standard- 
deviation  of  the  true  values  X, 

But the true frequency distribution is rarely or never a
histogram, and trial on any frequency distribution approximating
to the symmetrical or slightly asymmetrical forms of fig. 5, p. 89,
or fig. 9 (a), p. 92, shows that grouping tends to increase rather
than reduce the standard-deviation. If we assume, as in § 3, that
the correlation between $\delta$ and X, instead of $\delta$ and $X_1$, is appreciably
zero and that the standard-deviation of $\delta$ may be taken as $c/\sqrt{12}$
as before (the values of $\delta$ being to a first approximation uniformly
distributed over the class-interval when all the intervals are
considered together), then we have

$$\sigma^2 = \sigma_1^2 - \frac{c^2}{12} \qquad (4)$$

This is a formula of correction for grouping (Sheppard's correction,
refs. 1 to 4) that is very frequently used, and that trial
(ref.  1)  shows  to  give  very  good  results  for  a  curve  approximating 
closely  to  the  form  of  fig.  5,  p.  89.  The  strict  proof  of  the 
formula  lies  outside  the  scope  of  an  elementary  work  :  it  is  based 
on  two  assumptions:  (1)  that  the  distribution  of  frequency  is 
continuous, (2) that the frequency tapers off gradually to zero
in  both  directions.  The  formula  would  not  give  accurate  results 
in  the  case  of  such  a  distribution  as  that  of  fig.  9  (b),  p.  92,  or 
fig.  14,  p.  97,  neither  is  it  applicable  at  all  to  the  more  divergent 
forms  such  as  those  of  figs.  15,  et  seq. 
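
The working of the correction may be checked numerically. The following is a minimal sketch in Python, not part of the original text: it assumes a normal curve of unit standard-deviation, groups it into class-intervals of width c = 0.5, and applies equation (4) taken in the form $\sigma^2 = \sigma_1^2 - c^2/12$. The function names and the choice of c are illustrative assumptions only.

```python
import math

def normal_cdf(z):
    """Cumulative frequency of the normal curve with unit standard-deviation."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def grouped_variance(c, half_range=10.0):
    """Variance of the grouped values when every observation is moved
    to the centre of its class-interval of width c (centres at 0, +c, -c, ...)."""
    k_max = int(half_range / c)
    total = 0.0
    for k in range(-k_max, k_max + 1):
        centre = k * c
        freq = normal_cdf(centre + c / 2) - normal_cdf(centre - c / 2)
        total += freq * centre ** 2  # the grouped mean is zero by symmetry
    return total

def sheppard_corrected_variance(grouped_var, c):
    """Equation (4): subtract c^2/12 from the variance of the grouped values."""
    return grouped_var - c ** 2 / 12.0

c = 0.5
raw = grouped_variance(c)
corrected = sheppard_corrected_variance(raw, c)
print(f"variance of grouped values = {raw:.5f}")
print(f"after Sheppard's correction = {corrected:.5f} (true value 1)")
```

For a curve of this form the grouped variance exceeds the true value by very nearly $c^2/12$, so that the corrected figure returns almost exactly to unity; for the J-shaped or U-shaped forms excluded above no such agreement should be expected.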

5. If certain observations be repeated so that we have in every
case two measures $x_1$ and $x_2$ of the same deviation x, it is possible
to obtain the true standard-deviation if the further assumption
is legitimate that the errors $\delta_1$ and $\delta_2$ are uncorrelated with each
other. On this assumption

$$\Sigma(x_1 x_2) = \Sigma(x + \delta_1)(x + \delta_2) = \Sigma(x^2),$$

and accordingly

$$\sigma_x^2 = \frac{\Sigma(x_1 x_2)}{N} \qquad (5)$$

(This  formula  is  part  of  Spearman's  formula  for  the  correction  of 
the  correlation-coefficient,  cf.  §  7.) 

6. Influence of Errors of Observation on the Correlation-coefficient.
— Let $x_1$, $y_1$ be the observed deviations from the arithmetic means,
x, y the true deviations, and $\delta$, $\epsilon$ the errors of observation. Of
the four quantities x, y, $\delta$, $\epsilon$ we will suppose x and y alone to
be correlated. On this assumption

$$\Sigma(x_1 y_1) = \Sigma(xy) \qquad (6)$$

It follows at once that

$$\frac{r_{x_1 y_1}}{r_{xy}} = \frac{\sigma_x \sigma_y}{\sigma_{x_1} \sigma_{y_1}},$$


and  consequently  the  observed  correlation  is  less  than  the  true 
correlation.  This  difference,  it  should  be  noticed,  no  mere  increase 
in  the  number  of  observations  can  in  any  way  lessen. 

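7. Spearman's Theorems. — If, however, the observations of both
x and y be repeated, as assumed in § 5, so that we have two
measures $x_1$ and $x_2$, $y_1$ and $y_2$ of every value of x and y, the true
value of the correlation can be obtained by the use of equations
(5) and (6), on assumptions similar to those made above. For
we have

$$r_{xy} = \frac{\Sigma(x_1 y_1)}{\sqrt{\Sigma(x_1 x_2)\,\Sigma(y_1 y_2)}} = \frac{\Sigma(x_1 y_2)}{\sqrt{\Sigma(x_1 x_2)\,\Sigma(y_1 y_2)}} \qquad (7)$$

Or, if we use all the four possible correlations between observed
values of x and observed values of y,

$$r_{xy} = \frac{(r_{x_1 y_1}\, r_{x_1 y_2}\, r_{x_2 y_1}\, r_{x_2 y_2})^{1/4}}{(r_{x_1 x_2}\, r_{y_1 y_2})^{1/2}} \qquad (8)$$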


Equation (8) is the original form in which Spearman gave his
correction formula (refs. 6, 7). It will be seen to imply the
assumption that, of the six quantities x, y, $\delta_1$, $\delta_2$, $\epsilon_1$, $\epsilon_2$, x and y
alone are correlated. The correction given by the second part
of equation (7), also suggested by Spearman, seems, on the
whole, to be safer, for it eliminates the assumption that the errors
in x and in y, in the same series of observations, are uncorrelated.
A partial, though insufficient, test of the correctness of the
assumptions may be made by correlating $x_1 - x_2$ with $y_1 - y_2$: this
correlation should vanish. Evidently, however, it may vanish
from symmetry without thereby implying that all the correlations
of the errors are zero.
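
The results of §§ 5 to 7 may be illustrated on artificial figures. The sketch below is not from the original text: it builds true deviations and errors from mutually uncorrelated patterns of +1 and -1 (rows of a Sylvester-Hadamard matrix, an assumption made purely so that the conditions of the theorems hold exactly), and it takes equation (5) as $\sigma_x^2 = \Sigma(x_1 x_2)/N$ and the second part of equation (7) in the product-sum form $\Sigma(x_1 y_2)/\sqrt{\Sigma(x_1 x_2)\Sigma(y_1 y_2)}$. All names are illustrative.

```python
import math

def walsh_rows(order):
    """Rows of a Sylvester-Hadamard matrix of size 2**order (entries +1 or -1)."""
    rows = [[1]]
    for _ in range(order):
        rows = [r + r for r in rows] + [r + [-v for v in r] for r in rows]
    return rows

def product_sum(u, v):
    return sum(a * b for a, b in zip(u, v))

def correlation(u, v):
    """Correlation of two series of deviations measured from zero means."""
    return product_sum(u, v) / math.sqrt(product_sum(u, u) * product_sum(v, v))

# Seven zero-mean patterns of length 8, every pair of them uncorrelated.
h = walsh_rows(3)[1:]

x = [2.0 * a for a in h[0]]                          # true deviations of x
y = [1.2 * a + 1.6 * b for a, b in zip(h[0], h[1])]  # true deviations of y
x1 = [t + e for t, e in zip(x, h[2])]                # first measure of x
x2 = [t + e for t, e in zip(x, h[3])]                # second measure of x
y1 = [t + e for t, e in zip(y, h[4])]                # first measure of y
y2 = [t + e for t, e in zip(y, h[5])]                # second measure of y

n = len(x)
true_r = correlation(x, y)
observed_r = correlation(x1, y1)            # lowered by the errors, as in section 6
variance_x = product_sum(x1, x2) / n        # equation (5)
corrected_r = product_sum(x1, y2) / math.sqrt(
    product_sum(x1, x2) * product_sum(y1, y2)
)                                           # second part of equation (7)

print(true_r, observed_r, corrected_r, variance_x)
```

In these figures the true correlation is 0.6 and the observed correlation 0.48; the corrected value returns to 0.6, and equation (5) recovers the true variance of 4. With real observations the errors are never exactly uncorrelated, and the agreement will be correspondingly rougher.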

8.  Mean  and  Standard-deviation  of  an  Index. — (Ref.  11.)  The 
means  and  standard-deviations  of  non-linear  functions  of  two  or 
more  variables  can  in  general  only  be  expressed  in  terms  of  the 
means  and  standard-deviations  of  the  original  variables  to  a  first 
approximation,  on  the  assumption  that  deviations  are  small 
compared  with  the  mean  values  of  the  variables.  Thus  let  it  be 
required  to  find  the  mean  and  standard-deviation  of  a  ratio  or 
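index $Z = X_1/X_2$, in terms of the constants for $X_1$ and $X_2$. Let I
be the mean of Z and $M_1$, $M_2$ the means of $X_1$ and $X_2$. Then

$$I = \frac{1}{N}\,\Sigma\left(\frac{X_1}{X_2}\right) = \frac{M_1}{N M_2}\,\Sigma\left(1 + \frac{x_1}{M_1}\right)\left(1 + \frac{x_2}{M_2}\right)^{-1}.$$

Expand the second bracket by the binomial theorem, assuming
that $x_2/M_2$ is so small that powers higher than the second can
be neglected. Then to this approximation

$$I = \frac{M_1}{N M_2}\,\Sigma\left(1 + \frac{x_1}{M_1}\right)\left(1 - \frac{x_2}{M_2} + \frac{x_2^2}{M_2^2}\right).$$

That is, if r be the correlation between $x_1$ and $x_2$, and if $v_1 = \sigma_1/M_1$,
$v_2 = \sigma_2/M_2$,

$$I = \frac{M_1}{M_2}\left(1 - r v_1 v_2 + v_2^2\right) \qquad (9)$$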

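If s be the standard-deviation of Z we have

$$s^2 + I^2 = \frac{1}{N}\,\Sigma\left(\frac{X_1}{X_2}\right)^2 = \frac{M_1^2}{N M_2^2}\,\Sigma\left(1 + \frac{x_1}{M_1}\right)^2\left(1 + \frac{x_2}{M_2}\right)^{-2}.$$

Expanding the second bracket again by the binomial theorem,
and neglecting terms of all orders above the second,

$$s^2 + I^2 = \frac{M_1^2}{M_2^2}\left(1 + v_1^2 - 4 r v_1 v_2 + 3 v_2^2\right),$$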



or from (9)

$$s^2 = \frac{M_1^2}{M_2^2}\left(v_1^2 - 2 r v_1 v_2 + v_2^2\right) \qquad (10)$$
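
The closeness of the approximation may be judged from a small trial. The sketch below is not part of the original text; it takes equations (9) and (10) in the forms $I = (M_1/M_2)(1 - r v_1 v_2 + v_2^2)$ and $s^2 = (M_1/M_2)^2 (v_1^2 - 2 r v_1 v_2 + v_2^2)$, and compares them with the mean and standard-deviation of the ratio computed directly from four hypothetical pairs of values in which $X_1$ and $X_2$ are uncorrelated. The function name and the figures are illustrative assumptions.

```python
import math
from itertools import product

def index_mean_and_sd(m1, m2, sd1, sd2, r):
    """First-approximation mean and standard-deviation of Z = X1/X2,
    following the forms of equations (9) and (10) stated above."""
    v1, v2 = sd1 / m1, sd2 / m2  # coefficients of variation
    mean_z = (m1 / m2) * (1 - r * v1 * v2 + v2 ** 2)
    sd_z = (m1 / m2) * math.sqrt(v1 ** 2 - 2 * r * v1 * v2 + v2 ** 2)
    return mean_z, sd_z

# Hypothetical data: X1 = 100 +/- 2 and X2 = 50 +/- 1 in all four combinations,
# so that the means are 100 and 50, the standard-deviations 2 and 1, and r = 0.
pairs = [(100 + a, 50 + b) for a, b in product([-2, 2], [-1, 1])]
ratios = [x1 / x2 for x1, x2 in pairs]
exact_mean = sum(ratios) / len(ratios)
exact_sd = math.sqrt(sum((z - exact_mean) ** 2 for z in ratios) / len(ratios))

approx_mean, approx_sd = index_mean_and_sd(100, 50, 2, 1, 0.0)
print(f"mean: exact {exact_mean:.6f}, approximate {approx_mean:.6f}")
print(f"s.d.: exact {exact_sd:.6f}, approximate {approx_sd:.6f}")
```

With coefficients of variation of 2 per cent. the two pairs of figures agree to several places; the agreement must be expected to deteriorate as the deviations cease to be small compared with the means.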

9. Correlation between Indices. — (Ref. 11.) The following
problem affords a further illustration of the use of the same method.
Required to find approximately the correlation between two ratios
$Z_1 = X_1/X_3$, $Z_2 = X_2/X_3$, $X_1$, $X_2$ and $X_3$ being uncorrelated.

Let the means of the two ratios or indices be $I_1$, $I_2$ and the
standard-deviations $s_1$, $s_2$; these are given approximately by (9)
and (10) of the last section. The required correlation $\rho$ will be
given by

$$N \rho\, s_1 s_2 = \Sigma\left(\frac{X_1}{X_3} - I_1\right)\left(\frac{X_2}{X_3} - I_2\right) = \Sigma\left(\frac{X_1 X_2}{X_3^2}\right) - N I_1 I_2.$$

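Neglecting terms of higher order than the second as before and
remembering that all correlations are zero, we have

$$\rho\, s_1 s_2 = \frac{M_1 M_2}{M_3^2}\left(1 + 3 v_3^2\right) - I_1 I_2 = \frac{M_1 M_2}{M_3^2}\, v_3^2,$$

where, in the last step, a term of the order $v_3^4$ has again been
neglected. Substituting from (10) for $s_1$ and $s_2$, we have finally

$$\rho = \frac{v_3^2}{\sqrt{(v_1^2 + v_3^2)(v_2^2 + v_3^2)}} \qquad (11)$$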

This value of $\rho$ is obviously positive, being equal to 0.5 if
$v_1 = v_2 = v_3$; and hence even if $X_1$ and $X_2$ are independent, the
indices formed by taking their ratios to a common denominator $X_3$ will
be correlated. The value of $\rho$ is termed by Professor Pearson the
"spurious correlation." Thus if measurements be taken, say, on
three bones of the human skeleton, and the measurements grouped
in threes absolutely at random, there will, nevertheless, be a
positive correlation, probably approaching 0.5, between the indices
formed by the ratios of two of the measurements to the third. To
give another illustration, if two individuals both observe the same
series of magnitudes quite independently, there may be little, if
any, correlation between their absolute errors. But if the errors
be expressed as percentages of the magnitude observed, there
may be considerable correlation. It does not follow of necessity
that the correlations between indices or ratios are misleading.
If the indices are uncorrelated, there will be a similar "spurious"
correlation between the absolute measurements $Z_1 X_3 = X_1$ and
$Z_2 X_3 = X_2$, and the answer to the question whether the correlation
between indices or that between absolute measures is misleading
depends on the further question whether the indices or the
absolute measures are the quantities directly determined by the
causes under investigation (cf. ref. 13).

The case considered, where $X_1$, $X_2$, $X_3$ are uncorrelated, is only
a special one; for the general discussion cf. ref. 11. For an
interesting study of actual illustrations cf. ref. 14.
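
The spurious correlation may be exhibited on artificial figures. The sketch below is not part of the original text; it takes the approximate result of this section in the form $\rho = v_3^2/\sqrt{(v_1^2 + v_3^2)(v_2^2 + v_3^2)}$, and compares it with the correlation computed directly for eight hypothetical triplets in which $X_1$, $X_2$ and $X_3$ are strictly uncorrelated. The function names and the figures are illustrative assumptions.

```python
import math
from itertools import product

def spurious_correlation(v1, v2, v3):
    """Approximate correlation between X1/X3 and X2/X3 when X1, X2, X3
    are uncorrelated, v1, v2, v3 being their coefficients of variation."""
    return v3 ** 2 / math.sqrt((v1 ** 2 + v3 ** 2) * (v2 ** 2 + v3 ** 2))

def pearson(u, v):
    """Product-moment correlation of two series of equal length."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

# Hypothetical data: each variable takes the values 98 and 102 (mean 100,
# coefficient of variation 0.02) in all eight possible combinations.
triples = list(product([98, 102], repeat=3))
x1 = [t[0] for t in triples]
x2 = [t[1] for t in triples]
z1 = [t[0] / t[2] for t in triples]  # first index
z2 = [t[1] / t[2] for t in triples]  # second index, same denominator

absolute_r = pearson(x1, x2)
exact_rho = pearson(z1, z2)
approx_rho = spurious_correlation(0.02, 0.02, 0.02)
print(absolute_r, exact_rho, approx_rho)
```

The absolute measures show no correlation, while the two indices exhibit a correlation very close to the value 0.5 given by the approximate formula for equal coefficients of variation.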

10. The Correlation-coefficient for a two- × two-fold Table. — The
correlation-coefficient is in general only calculated for a table with
a considerable number of rows and columns, such as those given
in Chapter IX. In some cases, however, a theoretical value is
obtainable for the coefficient, which holds good even for the limiting
case when there are only two values possible for each variable (e.g.
0 and 1) and consequently two rows and two columns (cf. one
illustration in § 11, and for others the references given in questions 11
and 12). It is therefore of some interest to obtain an expression
for the coefficient in this case in terms of the class-frequencies.

Using  the  notation  of  Chapters  I.-IV.  the  table  may  be  written 
in  the  form 


Values of              Values of First Variable.
Second Variable.       $X_1$            $X_2$              Total

$Y_1$                  (AB)             $(\alpha B)$       (B)
$Y_2$                  $(A\beta)$       $(\alpha\beta)$    $(\beta)$

Total                  (A)              $(\alpha)$         N

Taking the centre of the table as arbitrary origin and the
class-interval, as usual, as the unit, the co-ordinates of the
mean are

$$\xi = \tfrac{1}{2}\{(\alpha) - (A)\}/N, \qquad \eta = \tfrac{1}{2}\{(\beta) - (B)\}/N.$$

The standard-deviations are given by

$$\sigma_1^2 = 0.25 - \xi^2 = (A)(\alpha)/N^2$$
$$\sigma_2^2 = 0.25 - \eta^2 = (B)(\beta)/N^2.$$

Finally,

$$\Sigma(xy) = \tfrac{1}{4}\{(AB) + (\alpha\beta) - (A\beta) - (\alpha B)\} - N\xi\eta.$$

Writing

$$(AB) - (A)(B)/N = \delta$$

(as in Chap. III. §§ 11-12) and replacing $\xi$, $\eta$ by their values,
this reduces to

$$\Sigma(xy) = \delta.$$

Whence

$$r = \frac{N\delta}{\sqrt{(A)(\alpha)(B)(\beta)}} \qquad (12)$$

This value of r can be used as a coefficient of association, but,
unlike the association-coefficient of Chap. III. § 13, which is
unity if either (AB) = (A) or (AB) = (B), r only becomes unity if
(AB) = (A) = (B). This is the only case in which both frequencies
$(\alpha B)$ and $(A\beta)$ can vanish so that (AB) and $(\alpha\beta)$ correspond to
the frequencies of two points $X_1 Y_1$, $X_2 Y_2$ on a line. Obviously
this alone renders the numerical values of the two coefficients
quite incomparable with each other. But further, while the
association coefficient is the same for all tables derived from one
another by multiplying rows or columns by arbitrary coefficients,
the correlation coefficient (12) is greatest when $(A) = (\alpha)$ and
$(B) = (\beta)$, i.e. when the table is symmetrical, and its value is
lowered when the symmetrical table is rendered asymmetrical by
increasing or reducing the number of A's or B's. For moderate
degrees of association, the association coefficient gives much the
larger values. The two coefficients possess, in fact, essentially
different properties, and are different measures of association in
the same sense that the geometric and arithmetic means are
different forms of average, or the interquartile range and the
standard-deviation different measures of dispersion.

The  student  is  again  referred  to  ref.  3  of  Chap.  III.  for  a 
general  discussion  of  various  measures  of  association,  including 
these  and  others,  that  have  been  proposed. 
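
The contrast between the two coefficients is readily seen on figures. The sketch below is not part of the original text; it computes the correlation-coefficient of this section in the form $r = N\delta/\sqrt{(A)(\alpha)(B)(\beta)}$ with $\delta = (AB) - (A)(B)/N$, and sets beside it the association-coefficient, which is assumed here to be the familiar form $Q = \{(AB)(\alpha\beta) - (A\beta)(\alpha B)\}/\{(AB)(\alpha\beta) + (A\beta)(\alpha B)\}$, since Chap. III. is not reproduced in this section. The frequencies are hypothetical.

```python
import math

def fourfold_correlation(ab, a_beta, alpha_b, alpha_beta):
    """Correlation-coefficient for a two- by two-fold table from its four
    class-frequencies (AB), (A beta), (alpha B), (alpha beta)."""
    n = ab + a_beta + alpha_b + alpha_beta
    a, alpha = ab + a_beta, alpha_b + alpha_beta
    b, beta = ab + alpha_b, a_beta + alpha_beta
    delta = ab - a * b / n
    return n * delta / math.sqrt(a * alpha * b * beta)

def association_coefficient(ab, a_beta, alpha_b, alpha_beta):
    """Association-coefficient Q (form assumed, see the note above)."""
    direct, cross = ab * alpha_beta, a_beta * alpha_b
    return (direct - cross) / (direct + cross)

# A symmetrical table with moderate association.
r_sym = fourfold_correlation(40, 10, 10, 40)
q_sym = association_coefficient(40, 10, 10, 40)

# A table in which (AB) = (A): Q reaches unity, r does not.
r_edge = fourfold_correlation(30, 0, 20, 50)
q_edge = association_coefficient(30, 0, 20, 50)

print(r_sym, q_sym)
print(r_edge, q_edge)
```

In the symmetrical table r is 0.6 while Q is about 0.88; in the second table Q is unity although r remains well below it, which illustrates why the two values cannot be compared with one another.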

11. The Correlation-coefficient for all possible pairs of N values
of a Variable. — In certain cases a correlation-table is formed by
combining N observations in pairs in all possible ways. If, for
example, a table is being formed to illustrate, say, the correlation
between brothers for stature, and there are three brothers in
one family with statures 5 ft. 9, 5 ft. 10, and 5 ft. 11, these are
regarded as giving the six pairs

5 ft. 9 with 5 ft. 10          5 ft. 10 with 5 ft. 9
5 ft. 9 with 5 ft. 11          5 ft. 11 with 5 ft. 9
5 ft. 10 with 5 ft. 11         5 ft. 11 with 5 ft. 10

which may be entered into the table. The entire table will be
formed from the aggregate of such subsidiary tables, each due to
one family. Let it be required to find the correlation-coefficient,
however, for a single subsidiary table, due to a family with N
members, the numbers of pairs being therefore N(N - 1).

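As each observed value of the variable occurs N - 1 times,
i.e. once in combination with every other value, the means and
standard-deviations of the totals of the correlation-table are the
same as for the original N observations, say M and $\sigma$. If $x_1$,
$x_2$, $x_3$, .... be the observed deviations, the product sum may be
written

$$\begin{aligned}
 & x_1 x_2 + x_1 x_3 + x_1 x_4 + \ldots \\
+\; & x_2 x_1 + x_2 x_3 + x_2 x_4 + \ldots \\
+\; & x_3 x_1 + x_3 x_2 + x_3 x_4 + \ldots \\
+\; & \ldots \\
=\; & -x_1^2 - x_2^2 - x_3^2 - \ldots = -N\sigma^2,
\end{aligned}$$

each row reducing to the negative square of its own deviation because
the deviations sum to zero; whence, there being N(N - 1) pairs,

$$r = -\frac{N\sigma^2}{N(N-1)\sigma^2} = -\frac{1}{N-1} \qquad (13)$$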


For N = 2, 3, 4 .... this gives the successive values of r = -1,
-1/2, -1/3 .... It is clear that the first value is right, for two
values $x_1$, $x_2$ only determine the two points ($x_1$, $x_2$) and ($x_2$, $x_1$),
and the slope of the line joining them is negative.

The  student  should  notice  that  a  corresponding  negative 
association  will  arise  between  the  first  and  second  member  of  the 
pair if all possible pairs are formed in a mixture of A's and $\alpha$'s.
Looking  at  the  association,  in  fact,  from  the  standpoint  of  §  10, 
the  equation  (13)  still  holds,  even  if  the  variables  can  only  assume 
two  values,  e.g.  0  and  1.  This  result  is  utilised  in  §  14  of  Chapter 
XIV. 
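
The result of this section is easily verified by direct computation. The sketch below is not part of the original text; it forms all N(N - 1) ordered pairs from a family of N values and computes the correlation between the first and second members, which should agree with -1/(N - 1). The function names are illustrative, and the statures are entered in inches.

```python
import math
from itertools import permutations

def pearson(u, v):
    """Product-moment correlation of two series of equal length."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def all_pairs_correlation(values):
    """Correlation in the table formed by pairing every value with every other."""
    pairs = list(permutations(values, 2))  # N(N - 1) ordered pairs
    first = [p[0] for p in pairs]
    second = [p[1] for p in pairs]
    return pearson(first, second)

brothers = [69, 70, 71]  # 5 ft. 9, 5 ft. 10, 5 ft. 11
print(all_pairs_correlation(brothers))        # N = 3
print(all_pairs_correlation([1, 5]))          # N = 2
print(all_pairs_correlation([2, 3, 5, 11]))   # N = 4
```

The three brothers give r = -1/2 whatever their actual statures, since the value depends only on the number in the family.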

12.  Correlation  due  to  Heterogeneity  of  Material. — The  following 
theorem  offers  some  analogy  with  the  theorem  of  Chap.  IV. 
§ 6 for attributes. — If X and Y are uncorrelated in each of two
records, they will nevertheless exhibit some correlation when the
two records are mingled, unless the mean value of X in the
second record is identical with that in the first record, or the mean
value of Y in the second record is identical with that in the first
record, or both.

This follows almost at once, for if $M_1$, $M_2$ are the mean values of
X in the two records, $K_1$, $K_2$ the mean values of Y, $N_1$, $N_2$ the
numbers of observations, and M, K the means when the two
records are mingled, the product-sum of deviations about M, K is

$$N_1(M_1 - M)(K_1 - K) + N_2(M_2 - M)(K_2 - K).$$

Evidently the first term can only be zero if $M = M_1$ or $K = K_1$.
But the first condition gives

$$\frac{N_1 M_1 + N_2 M_2}{N_1 + N_2} = M_1,$$

that is, $M_1 = M_2$.

Similarly, the second condition gives $K_1 = K_2$. Both the first
and second terms can, therefore, only vanish if $M_1 = M_2$ or
$K_1 = K_2$. Correlation may accordingly be created by the mingling
of  two  records  in  which  X  and  Y  vary  round  different  means. 
(For  a  more  general  form  of  the  theorem  cf.  ref.  20.) 
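
A small numerical case makes the theorem plain. The sketch below is not part of the original text; the three records are hypothetical, each being constructed so that X and Y are uncorrelated within it, and the function names are illustrative.

```python
import math
from itertools import product

def pearson(pairs):
    """Product-moment correlation of a list of (X, Y) pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    sxx = sum((p[0] - mx) ** 2 for p in pairs)
    syy = sum((p[1] - my) ** 2 for p in pairs)
    return sxy / math.sqrt(sxx * syy)

record_1 = list(product([0, 1], [0, 1]))              # X and Y uncorrelated
record_2 = [(x + 10, y + 10) for x, y in record_1]    # both means displaced
record_3 = [(x + 10, y) for x, y in record_1]         # mean of X alone displaced

r_within = pearson(record_1)
r_both_shifted = pearson(record_1 + record_2)
r_one_shifted = pearson(record_1 + record_3)
print(r_within, r_both_shifted, r_one_shifted)
```

Mingling the first record with the second, in which both means differ, creates a correlation of 200/202; mingling it with the third, in which the mean of Y is unchanged, creates none.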

13. Reduction of Correlation due to mingling of uncorrelated
with correlated pairs. — Suppose that $n_1$ observations of x and y
give a correlation-coefficient

$$r_1 = \frac{\Sigma(xy)}{n_1 \sigma_x \sigma_y}.$$

Now let $n_2$ pairs be added to the material, the means and
standard-deviations of x and y being the same as in the first
series of observations, but the correlation zero. The value of
$\Sigma(xy)$ will then be unaltered, and we will have

$$r_2 = \frac{\Sigma(xy)}{(n_1 + n_2)\,\sigma_x \sigma_y}.$$

Whence

$$\frac{r_2}{r_1} = \frac{n_1}{n_1 + n_2} \qquad (14)$$

Suppose, for example, that a number of bones of the human
skeleton have been disinterred during some excavations, and
a correlation $r_2$ is observed between pairs of bones presumed
to come from the same skeleton, this correlation being rather
lower than might have been expected, and subject to some
uncertainty owing to doubts as to the allocation of certain
bones. If $r_1$ is the value that would be expected from other
records, the difference might be accounted for on the hypothesis
that, in a proportion $(r_1 - r_2)/r_1$ of all the pairs, the bones do
not really belong to the same skeleton, and have been virtually
paired at random. (For a more general form of the theorem cf.
again ref. 20.)
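
The dilution of the correlation may be checked on artificial figures. The sketch below is not part of the original text; it takes the result of this section in the form $r_2/r_1 = n_1/(n_1 + n_2)$, and uses two hypothetical series with the same means and standard-deviations, the first perfectly correlated and the second uncorrelated. The function names are illustrative.

```python
import math

def pearson(pairs):
    """Product-moment correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    sxx = sum((p[0] - mx) ** 2 for p in pairs)
    syy = sum((p[1] - my) ** 2 for p in pairs)
    return sxy / math.sqrt(sxx * syy)

def proportion_paired_at_random(r_expected, r_observed):
    """Proportion (r1 - r2)/r1 of pairs that must be supposed mispaired."""
    return (r_expected - r_observed) / r_expected

correlated = [(-1, -1), (1, 1), (-1, -1), (1, 1)]   # n1 = 4 pairs, r1 = 1
at_random = [(-1, -1), (-1, 1), (1, -1), (1, 1)]    # n2 = 4 pairs, r = 0

r1 = pearson(correlated)
r2 = pearson(correlated + at_random)
share = proportion_paired_at_random(r1, r2)
print(r1, r2, share)
```

Here half the pairs are formed at random, the correlation falls from 1 to 0.5, and the proportion recovered from the two coefficients is accordingly one-half.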

14. The Weighted Mean. — The arithmetic mean M of a series
of values of a variable X was defined as the quotient of the sum
of those values by their number N, or

$$M = \Sigma(X)/N.$$

If, on the other hand, we multiply each several observed
value of X by some numerical coefficient or weight W, the
quotient of the sum of such products by the sum of the weights
is defined as a weighted mean of X, and may be denoted by $M'$
so that

$$M' = \Sigma(W.X)/\Sigma(W).$$

The distinction between "weighted" and "unweighted" means
is, it should be noted, very often formal rather than essential,
for the "weights" may be regarded as actual, estimated, or
virtual frequencies. The weighted mean then becomes simply
an arithmetic mean, in which some new quantity is regarded
as the unit. Thus if we are given the means $M_1$, $M_2$, ....
$M_r$ of r series of observations, but do not know the number
of observations in every series, we may form a general average
by taking the arithmetic mean of all the means, viz. $\Sigma(M)/r$,
treating the series as the unit. But if we know the number
of observations in every series it will be better to form the
weighted mean $\Sigma(NM)/\Sigma(N)$, weighting each mean in proportion
to  the  number  of  observations  in  the  series  on  which  it  is  based. 
The  second  form  of  average  would  be  quite  correctly  spoken 
of  as  a  weighted  mean  of  the  means  of  the  several  series  :  at 
the  same  time  it  is  simply  the  arithmetic  mean  of  all  the 
series  pooled  together,  i.e.  the  arithmetic  mean  obtained  by 
treating  the  observation  and  not  the  series  as  the  unit. 
(Chap.  VII.  §  13.) 

15.  To  give  an  arithmetical  illustration,  if  a  commodity  is  sold 
at  different  prices  in  different  markets,  it  will  be  better  to  form 
an  average  price,  not  by  taking  the  arithmetic  mean  of  the  several 
market  prices,  treating  the  market  as  the  unit,  but  by  weighting 
each  price  in  proportion  to  the  quantity  sold  at  that  price,  if 
known,  i.e.  treating  the  unit  of  quantity  as  the  unit  of  frequency. 
Thus  if  wheat  has  been  sold  in  market  A  at  an  average  price  of 
29s. 1d. per quarter, in market B at an average price of 27s. 7d.,
and in market C at an average price of 28s. 4d., we may, if no
statement is made as to the quantities sold at these prices (as very
often happens in the case of statements as to market prices), take
the arithmetic mean (28s. 4d.) as the general average. But if we
know that 23,930 qrs. were sold at A, only 26 qrs. at B, and 3933
qrs. at C, it will be better to take the weighted mean

$$\frac{(29\text{s. }1\text{d.} \times 23{,}930) + (27\text{s. }7\text{d.} \times 26) + (28\text{s. }4\text{d.} \times 3933)}{27{,}889} = 29\text{s. }0\text{d.}$$

to the nearest penny. This is appreciably higher than the
arithmetic  mean  price,  which  is  lowered  by  the  undue  importance 
attached  to  the  small  markets  B  and  C. 
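
The arithmetic of this illustration may be set out as follows. The sketch is not part of the original text; it simply converts the three prices to pence (twelve to the shilling) and forms the unweighted and weighted means from the figures quoted above. The variable names are illustrative.

```python
def to_pence(shillings, pence):
    """Convert a price in shillings and pence to pence."""
    return 12 * shillings + pence

# market: (average price per quarter in pence, quarters sold)
markets = {
    "A": (to_pence(29, 1), 23930),
    "B": (to_pence(27, 7), 26),
    "C": (to_pence(28, 4), 3933),
}

prices = [price for price, _ in markets.values()]
unweighted = sum(prices) / len(prices)

total_quantity = sum(qty for _, qty in markets.values())
weighted = sum(price * qty for price, qty in markets.values()) / total_quantity

for label, value in (("unweighted", unweighted), ("weighted", weighted)):
    shillings, pence = divmod(round(value), 12)
    print(f"{label} mean: {shillings}s. {pence}d.")
```

The unweighted mean is 28s. 4d. and the weighted mean 29s. 0d. to the nearest penny, the large sales at A dominating the latter.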

In  the  case  of  index-numbers  for  exhibiting  the  changes  in 
average prices from year to year (cf. Chap. VII. § 25), it may
make a sensible difference whether we take the simple arithmetic
mean of the index-numbers for different commodities in any one
year  as  representing  the  price-level  in  that  year,  or  weight  the 
index-numbers  for  the  several  commodities  according  to  their 
importance  from  some  point  of  view  ;  and  much  has  been  written 
as  to  the  weights  to  be  chosen.  If,  for  example,  our  standpoint 
be  that  of  some  average  consumer,  we  may  take  as  the  weight  for 
each  commodity  the  sum  which  he  spends  on  that  commodity  in 
an  average  year,  so  that  the  frequency  of  each  commodity  is 
taken  as  the  number  of  shillings  or  pounds  spent  thereon  instead 
of  simply  as  unity. 

Rates  or  ratios  like  the  birth-,  death-,  or  marriage-rates  of  a 
country  may  be  regarded  as  weighted  means.  For,  treating  the 
rate  for  simplicity  as  a  fraction,  and  not  as  a  rate  per  1000  of  the 
population, 

$$\text{Birth-rate of whole country} = \frac{\text{total births}}{\text{total population}} = \frac{\Sigma(\text{birth-rate in each district} \times \text{population in that district})}{\Sigma(\text{population of each district})}$$

i.e.  the  rate  for  the  whole  country  is  the  mean  of  the  rates  in  the 
different  districts,  weighting  each  in  proportion  to  its  population. 
We  use  the  weighted  and  unweighted  means  of  such  rates  as 
illustrations  in  §  17  below. 

16.  It  is  evident  that  any  weighted  mean  will  in  general  differ 
from  the  unweighted  mean  of  the  same  quantities,  and  it  is 
required to find an expression for this difference. If r be the
correlation between weights and variables, $\sigma_w$ and $\sigma_x$ the
standard-deviations, and $\bar{w}$ the mean weight, we have at once

$$\Sigma(W.X) = N(M.\bar{w} + r\,\sigma_w \sigma_x),$$

whence

$$M' = M + r\,\sigma_x \frac{\sigma_w}{\bar{w}} \qquad (15)$$

That is to say, if the weights and variables are positively correlated,
the weighted mean is the greater; if negatively, the less. In some
cases r is very small, and then weighting makes little difference,
but in others the difference is large and important, r having a
sensible value and $\sigma_x \sigma_w/\bar{w}$ a large value.
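
The relation between the two means is an identity and may be verified on any figures. The sketch below is not part of the original text; it takes equation (15) in the form $M' = M + r\,\sigma_x \sigma_w/\bar{w}$, the standard-deviations being computed with divisor N, and the values and weights are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sd(values):
    """Standard-deviation with divisor N, as used throughout the text."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def correlation(u, v):
    mu, mv = mean(u), mean(v)
    product_sum = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return product_sum / (len(u) * sd(u) * sd(v))

X = [1.0, 2.0, 3.0, 4.0]   # hypothetical values of the variable
W = [1.0, 1.0, 2.0, 4.0]   # hypothetical weights

direct = sum(w * x for w, x in zip(W, X)) / sum(W)
via_15 = mean(X) + correlation(W, X) * sd(X) * sd(W) / mean(W)
print(direct, via_15)
```

The weights here rise with the variable, so the weighted mean (3.125) exceeds the unweighted mean (2.5), and the two modes of calculation agree.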

17.  The  difference  between  weighted  and  unweighted  means 
of  death-rates,  birth-rates  or  other  rates  on  the  population  in 
different  districts  is,  for  instance,  nearly  always  of  importance. 
Thus  we  have  the  following  figures  for  rates  of  pauperism 
(Jour. Stat. Soc., vol. lix. (1896), p. 349).

Percentages of the Population in receipt of Relief.

January 1.     Arithmetic Mean of Rates      England and Wales
               in different Districts.       as a whole.

1850                   6.51                        5.80
1860                   5.20                        4.26
1870                   5.45                        4.77
1881                   3.68                        3.12
1891                   3.29                        2.69

In this case the weighted mean is markedly the less, and the
correlation between the population of a district and its pauperism
must therefore be negative, the larger (on the whole urban)
districts having the lower percentage in receipt of relief. On the
other hand, for the decade 1881-90 the average birth-rate for
England and Wales was 32.34 per thousand, the arithmetic
mean of the rates for the different districts 30.34 only. The
weighted mean was therefore the greater, the birth-rate being
higher in the more populous (urban) districts, in which there is
a greater proportion of young married persons.

For the year 1891 the average population of a Poor-law district
was found to be roughly 45,900 and the standard-deviation $\sigma_w$
56,400 (populations ranging from under 2000 to over half a
million). The standard-deviation $\sigma_x$ of the percentages of the
population in receipt of relief was 1.24. We have therefore,
for the correlation between pauperism and population,

$$r = -\frac{3.29 - 2.69}{1.24} \times \frac{459}{564} = -0.39.$$

For the birth-rate, on the other hand, assuming that $\sigma_w/\bar{w}$
is approximately the same for the decade 1881-90 as in 1891,
we have, $\sigma_x$ being 4.08,

$$r = \frac{32.34 - 30.34}{4.08} \times \frac{459}{564} = +0.40.$$

The  closeness  of  the  numerical  values  of  r  in  the  two  cases  is, 
of  course,  accidental. 
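
The two values of r may be recomputed from the figures given. The sketch below is not part of the original text; it solves equation (15), taken as $M' = M + r\,\sigma_x \sigma_w/\bar{w}$, for r, and inserts the means and standard-deviations quoted in this section. The function name is illustrative.

```python
def correlation_from_means(weighted_mean, unweighted_mean, sd_x, mean_w, sd_w):
    """Solve M' = M + r * sd_x * sd_w / mean_w for the correlation r."""
    return (weighted_mean - unweighted_mean) / sd_x * mean_w / sd_w

# Pauperism, 1891: weighted mean 2.69, unweighted 3.29, sigma_x = 1.24.
r_pauperism = correlation_from_means(2.69, 3.29, 1.24, 45900, 56400)

# Birth-rate, 1881-90: weighted mean 32.34, unweighted 30.34, sigma_x = 4.08.
r_birth_rate = correlation_from_means(32.34, 30.34, 4.08, 45900, 56400)

print(f"pauperism and population:  r = {r_pauperism:+.2f}")
print(f"birth-rate and population: r = {r_birth_rate:+.2f}")
```

The values -0.39 and +0.40 of the text are reproduced.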

18. The principle of weighting finds one very important
application in the treatment of such rates as death-rates, which
are largely affected by the age and sex-composition of the
population. Neglecting, for simplicity, the question of sex, suppose the
numbers of deaths are noted in a certain district for, say, the
age-groups 0-, 10-, 20-, etc., in which the fractions of the whole
population are $p_1$, $p_2$, $p_3$, etc., where $\Sigma(p) = 1$. Let the
death-rates for the corresponding age-groups be $d_1$, $d_2$, $d_3$, etc. Then
the ordinary or crude death-rate for the district is

$$D = \Sigma(d.p) \qquad (16)$$

For some other district taken as a basis of comparison, perhaps
the country as a whole, the death-rates and fractions of the
population in the several age-groups may be $\delta_1$, $\delta_2$, $\delta_3$ ...., $\pi_1$, $\pi_2$,
$\pi_3$ ...., and the crude death-rate

$$\Delta = \Sigma(\delta.\pi) \qquad (17)$$

Now D and $\Delta$ may differ either because the d's and $\delta$'s differ
or because the p's and $\pi$'s differ, or both. It may happen that
really both districts are about equally healthy, and the
death-rates approximately the same for all age-classes, but, owing to a
difference of weighting, the first average may be markedly higher
than the second, or vice versa. If the first district be a rural
district and the second urban, for instance, there will be a larger
proportion of the old in the former, and it may possibly have a
higher crude death-rate than the second, in spite of lower
death-rates in every class. The comparison of crude death-rates is
therefore liable to lead to erroneous conclusions. The difficulty
may be got over by averaging the age-class death-rates in the
district not with the weights $p_1$, $p_2$, $p_3$ .... given by its own
population, but with the weights $\pi_1$, $\pi_2$, $\pi_3$ .... given by the
population of the standard district. The standardised death-rate
for the district will then be

$$D' = \Sigma(d.\pi) \qquad (18)$$

and $D'$ and $\Delta$ will be comparable as regards age-distribution.
There is obviously no difficulty in taking sex into account as well
as age if necessary. The death-rates must be noted for each sex
separately in every age-class and averaged with a system of
weights based on the standard population. The method is also
of importance for comparing death-rates in different classes of the
population, e.g. those engaged in given occupations, as well as in
different districts, and is used for both these purposes in the
Decennial Supplements to the Reports of the Registrar General
for England and Wales (ref. 16).
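
The method of this section may be illustrated on invented figures. The sketch below is not part of the original text; the three age-classes, the death-rates and the fractional populations are all hypothetical, chosen so that the rural district has the lower rate in every class but the larger proportion of the old. The crude rates are taken as $D = \Sigma(d.p)$ and $\Delta = \Sigma(\delta.\pi)$, and the standardised rate as $D' = \Sigma(d.\pi)$.

```python
def weighted_rate(rates, fractions):
    """Sum of age-class rates, each weighted by a fractional population."""
    return sum(rate * share for rate, share in zip(rates, fractions))

# Hypothetical figures for three age-classes: young, middle, old.
d = [0.004, 0.010, 0.060]       # death-rates in the rural district
p = [0.30, 0.40, 0.30]          # its fractional populations
delta = [0.005, 0.012, 0.070]   # death-rates in the standard population
pi = [0.40, 0.45, 0.15]         # fractional populations of the standard

crude_district = weighted_rate(d, p)        # D
crude_standard = weighted_rate(delta, pi)   # Delta
standardised = weighted_rate(d, pi)         # D'

print(f"crude rate of district  D  = {1000 * crude_district:.1f} per 1000")
print(f"crude rate of standard  Δ  = {1000 * crude_standard:.1f} per 1000")
print(f"standardised rate       D' = {1000 * standardised:.1f} per 1000")
```

On these figures the crude rate of the district (23.2 per 1000) is above that of the standard population (17.9), although its rate is lower in every class; the standardised rate (15.1) restores the true order.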

19. Difficulty may arise in practical cases from the fact that
the death-rates $d_1$, $d_2$, $d_3$ .... are not known for the districts or
classes which it is desired to compare with the standard
population, but only the crude rates D and the fractional populations
of the age-classes $p_1$, $p_2$ .... The difficulty may be partially
obviated (cf. Chap. IV. § 9, pp. 51-3), by forming what is
termed an index death-rate $\Delta'$ for the class or district, $\Delta'$ being
given by

$$\Delta' = \Sigma(\delta.p) \qquad (19)$$

i.e. the rates of the standard population averaged with the
weights of the district population. It is the crude death-rate
that there would be in the district if the rate in every
age-class were the same as in the standard population. An
approximate standardised death-rate for the district or class is
then given by

$$D'' = D \times \frac{\Delta}{\Delta'} \qquad (20)$$

$D''$ is not necessarily, nor generally, the same as $D'$. It can
only be the same if

$$\frac{\Sigma(d.\pi)}{\Sigma(d.p)} = \frac{\Sigma(\delta.\pi)}{\Sigma(\delta.p)}.$$

This  will  hold  good  if,  e.g.,  the  death-rates  in  the  standard 
population  and  the  district  stand  to  one  another  in  the  same 
ratio in all age-classes, i.e. $\delta_1/d_1 = \delta_2/d_2 = \delta_3/d_3$ etc. This method
of  standardisation  is  used  in  the  Annual  Summaries  of  the 
Registrar-General  for  England  and  Wales. 
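
The approximate method may be compared with the direct method on invented figures. The sketch below is not part of the original text; all rates and fractional populations are hypothetical, the index death-rate is taken as $\Delta' = \Sigma(\delta.p)$ and the approximate standardised rate as $D'' = D \times \Delta/\Delta'$. It also tries a second district whose rates stand in a constant ratio to those of the standard population.

```python
def weighted_rate(rates, fractions):
    """Sum of age-class rates, each weighted by a fractional population."""
    return sum(rate * share for rate, share in zip(rates, fractions))

def approximate_standardised_rate(crude, standard_crude, index_rate):
    """D'' = D x Delta / Delta', needing only the crude rate of the district."""
    return crude * standard_crude / index_rate

# Hypothetical figures for three age-classes: young, middle, old.
d = [0.004, 0.010, 0.060]       # district rates (unknown in practice)
p = [0.30, 0.40, 0.30]          # district fractional populations
delta = [0.005, 0.012, 0.070]   # rates of the standard population
pi = [0.40, 0.45, 0.15]         # fractional populations of the standard

Delta = weighted_rate(delta, pi)
Delta_index = weighted_rate(delta, p)               # index death-rate
D_direct = weighted_rate(d, pi)                     # D' of section 18
D_approx = approximate_standardised_rate(weighted_rate(d, p), Delta, Delta_index)

# A district whose rates are everywhere four-fifths of the standard rates.
d_prop = [0.8 * rate for rate in delta]
D_direct_prop = weighted_rate(d_prop, pi)
D_approx_prop = approximate_standardised_rate(
    weighted_rate(d_prop, p), Delta, Delta_index
)

print(D_direct, D_approx)
print(D_direct_prop, D_approx_prop)
```

In the first district the two standardised rates are near one another but not identical; in the second, where the rates are proportional in every age-class, they coincide.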

Both methods of standardisation, that of § 18 and that of the
present section, are of great and growing importance. They are
obviously applicable to other rates besides death-rates, e.g.
birth-rates (cf. refs. 17, 18). Further, they may readily be extended
into quite different fields. Thus it has been suggested (ref. 19)
that standardised average heights or standardised average weights
of the children in different schools might be obtained on the
basis  of  a  standard  school  population  of  given  age  and  sex 
composition,  or  indeed  of  given  composition  as  regards  hair  and 
eye-colour  as  well. 

20.  In  §§  14-17  we  have  dealt  only  with  the  theory  of 
the  weighted  arithmetic  mean,  but  it  should  be  noted  that 
any  form  of  average  can  be  weighted.  Thus  a  weighted  median 
can  be  formed  by  finding  the  value  of  the  variable  such  that 
the  sum  of  the  weights  of  lesser  values  is  equal  to  the  sum 
of  the  weights  of  greater  values.  A  weighted  mode  could  be 
formed  by  finding  the  value  of  the  variable  for  which  the  sum 
of  the  weights  was  greatest,  allowing  for  the  smoothing  of 
casual  fluctuations.  Similarly,  a  weighted  geometric  mean  could 
be  calculated  by  weighting  the  logarithms  of  every  value  of  the 
variable before taking the arithmetic mean, i.e.

$$\log G' = \Sigma(W.\log X)/\Sigma(W).$$

EXERCISES.

1.  Find  the  values  obtained  for  the  standard-deviations  in  Examples  ii. 
(p.  139)  and  iii.  (p.  141)  of  Chapter  VIII.  on  applying  Sheppard's  correction 
for  grouping. 

2.  Show  that  if  a  range  of  six  times  the  standard-deviation  covers  at  least  18 
class-intervals (cf. Chap. VI. § 5), Sheppard's correction will make a difference
of less than 0.5 per cent. in the rough value of the standard-deviation.

3. (Data from the decennial supplements to the Annual Reports of the
Registrar-General for England and Wales.) The following particulars are
found for 36 small registration districts in which the number of births in a
decade ranged between 1500 and 2500:

                  Proportion of Male Births per 1000 of all Births.
Decade.           Mean.          Standard-deviation.

1881-1890         508.1          12.80
1891-1900         508.4          10.37

Both decades      508.25         11.65

It is believed, however, that a great part of the observed standard-deviation
is due to mere "fluctuations of sampling" of no real significance.

Given that the correlation between the proportions of male births in a
district in the two decades is +0.36, estimate (1) the true standard-deviation
freed from such fluctuations of sampling; (2) the standard-deviation of
fluctuations of sampling, i.e. of the errors produced by such fluctuations in the
observed proportions of male births.

4. (Data from Pearson, ref. 11.) The coefficients of variation for breadth,
height, and length of certain skulls are 3.89, 3.50, and 3.24 per cent.
respectively. Find the "spurious correlation" between the breadth/length and
height/length indices, absolute measures being combined at random so that
they are uncorrelated.

5. (Data from Boas, communicated to Pearson: cf. Fawcett and Pearson,
Proc. Roy. Soc., vol. lxii. p. 413.) From short series of measurements on
American Indians the mean coefficient of correlation found between father and
son, and father and daughter, for cephalic index, is 0.14; between mother and
son, and mother and daughter, 0.33. Assuming these coefficients should be
the same if it were not for the looseness of family relations, find the proportion
of children not due to the reputed father.

6. Find the correlation between $X_1 + X_2$ and $X_2 + X_3$; $X_1$, $X_2$ and $X_3$ being
uncorrelated.

7. Find the correlation between $X_1$ and $aX_1 + bX_2$, $X_1$ and $X_2$ being
uncorrelated.

8. (Referring to illustration iv., § 14, Chap. X.) Use the answer to
question 7 to estimate, very roughly, the correlation that would be found
between annual movements in infantile and general mortality if the mortality
of those under and over 1 year of age were uncorrelated. Note that

$$\text{general mortality per 1000 of population} = \text{infantile mortality per 1000 births} \times \frac{\text{births}}{\text{population}} + \text{deaths over one year per 1000 of population},$$

and treat the ratio of births to population as if it were constant at a rough
average value, say 0.033. The standard-deviation of annual movements in
infantile mortality is (loc. cit.) 9.6, and that of annual movements in mortality
other than infantile may be taken as sensibly the same as that of general
mortality, or say 1 unit.

9. If the relation

$$a x_1 + b x_2 + c x_3 = 0$$

holds for all values of $x_1$, $x_2$ and $x_3$ (which are, in our usual notation,
deviations from their respective arithmetic means), find the correlations
between $x_1$, $x_2$ and $x_3$ in terms of their standard-deviations and the values of
$a$, $b$ and $c$.

10.  What  is  the  effect  on  a  weighted  mean  of  errors  in  the  weights  or  the 
quantities  weighted,  such  errors  being  uncorrelated  with  each  other,  with  the 
weights,  or  with  the  variables — (1)  if  the  arithmetic  mean  values  of  the  errors 
are  zero  ;  (2)  if  the  arithmetic  mean  values  of  the  errors  are  not  zero  ? 

11. (Cf. Pearson, "On a Generalised Theory of Alternative Inheritance,"
Phil. Trans., vol. cciii., A, 1904, p. 53.) If we consider the correlation
between number of recessive couplets in parent and in offspring, in a
Mendelian population breeding at random (such as would ultimately result
from an initial cross between a pure dominant and a pure recessive), the
correlation is found to be 1/3 for a total number of couplets $n$. If $n = 1$, the
only possible numbers of recessive couplets are 0 and 1, and the correlation
table between parent and offspring reduces to the form

                    Parent.
Offspring.      0       1     Total
    0           5       1       6
    1           1       1       2
  Total         6       2       8

Verify  the  correlation,  and  work  out  the  association  coefficient  Q. 

12. (Cf. the above, and also Snow, Proc. Roy. Soc., vol. lxxxiii., B, 1910,
Table III., p. 42.) For a similar population the correlation between
brothers, assuming a practically infinite size of family, is 5/12. The table is

                        First Brother.
Second Brother.      0       1     Total.
      0             41       7       48
      1              7       9       16
    Total           48      16       64

Verify  the  correlation,  and  work  out  the  association  coefficient  Q. 

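13. Referring to the notation of § 10, show that we have the following
expressions for the regressions in a fourfold table:

$$r\frac{\sigma_1}{\sigma_2} = \frac{N\delta}{(B)(\beta)} = \frac{(AB)}{(B)} - \frac{(A\beta)}{(\beta)},$$

$$r\frac{\sigma_2}{\sigma_1} = \frac{N\delta}{(A)(\alpha)} = \frac{(AB)}{(A)} - \frac{(\alpha B)}{(\alpha)}.$$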

Verify  on  the  tables  of  questions  11  and  12. 


CHAPTER  XII. 


PARTIAL  CORRELATION. 

1-2. Introductory explanation—3. Direct deduction of the formulae for two
variables—4. Special notation for the general case: generalised
regressions—5. Generalised correlations—6. Generalised deviations and
standard-deviations—7-8. Theorems concerning the generalised
product-sums—9. Direct interpretation of the generalised regressions—
10-11. Reduction of the generalised standard-deviation—12. Reduction
of the generalised regression—13. Reduction of the generalised
correlation-coefficient—14. Arithmetical work: Example i.: Example
ii.—15. Geometrical representation of correlation between three
variables by means of a model—16. The coefficient of n-fold correlation
—17. Expression of regressions and correlations of lower in terms of
those of higher order—18. Limiting inequalities between the values of
correlation-coefficients necessary for consistence—19. Fallacies.

1. In Chapters IX.-XI. the theory of the correlation-coefficient for
a  single  pair  of  variables  has  been  developed  and  its  applications 
illustrated.  But  in  the  case  of  statistics  of  attributes  we  found 
it  necessary  to  proceed  from  the  theory  of  simple  association  for 
a  single  pair  of  attributes  to  the  theory  of  association  for  several 
attributes,  in  order  to  be  able  to  deal  with  the  complex  causation 
characteristic  of  statistics ;  and  similarly  the  student  will  find  it 
impossible  to  advance  very  far  in  the  discussion  of  many  problems 
in  correlation  without  some  knowledge  of  the  theory  of  multiple 
correlation, or correlation between several variables. In such a
problem  as  that  of  illustration  i.,  Chap.  X.,  for  instance,  it  might 
be  found  that  changes  in  pauperism  were  highly  correlated 
(positively)  with  changes  in  the  out-relief  ratio,  and  also  with 
changes  in  the  proportion  of  old  ;  and  the  question  might  arise  how 
far  the  first  correlation  was  due  merely  to  a  tendency  to  give  out- 
relief  more  freely  to  the  old  than  the  young,  i.e.  to  a  correlation 
between  changes  in  out-relief  and  changes  in  proportion  of  old. 
The question could not at the present stage be answered by working
out the correlation-coefficient between the last pair of variables,
for we have as yet no guide as to how far a correlation between
the variables 1 and 2 can be accounted for by correlations
between 1 and 3 and 2 and 3. Again, in the case of illustration iii.,
Chap.  X.,  a  marked  positive  correlation  might  be  observed  between, 
say,  the  bulk  of  a  crop  and  the  rainfall  during  a  certain  period,  and 
practically  no  correlation  between  the  crop  and  the  accumulated 
temperature  during  the  same  period  ;  and  the  question  might  arise 
whether the last result might not be due merely to a negative
correlation between rain and accumulated temperature, the crop
being favourably affected by an increase of accumulated temperature
if other things were equal, but failing as a rule to obtain this
benefit owing to the concomitant deficiency of rain. In the problem
of inheritance in a population, the corresponding problem is
of great importance, as already indicated in Chapter IV. It is
essential  for  the  discussion  of  possible  hypotheses  to  know  whether 
an  observed  correlation  between,  say,  grandson  and  grandparent 
can  or  cannot  be  accounted  for  solely  by  observed  correlations 
between  grandson  and  parent,  parent  and  grandparent. 

2.  Problems  of  this  type,  in  which  it  is  necessary  to  consider 
simultaneously  the  relations  between  at  least  three  variables,  and 
possibly  more,  may  be  treated  by  a  simple  and  natural  extension 
of  the  method  used  in  the  case  of  two  variables.  The  latter  case 
was  discussed  by  forming  linear  equations  between  the  two 
variables,  assigning  such  values  to  the  constants  as  to  make  the 
sum  of  the  squares  of  the  errors  of  estimate  as  low  as  possible : 
the  more  complicated  case  may  be  discussed  by  forming  linear 
equations  between  any  one  of  the  n  variables  involved,  taking 
each in turn, and the $n - 1$ others, again assigning such values to
the constants as to make the sum of the squares of the errors of
estimate a minimum. If the variables are $X_1, X_2, X_3, \ldots X_n$,
the equation will be of the form

$$X_1 = a + b_2 X_2 + b_3 X_3 + \ldots + b_n X_n.$$

If in such a generalised regression or characteristic equation we
find a sensible positive value for any one coefficient such as $b_2$,
we know that there must be a positive correlation between $X_1$
and $X_2$ that cannot be accounted for by mere correlations of $X_1$
and $X_2$ with $X_3$, $X_4$, or $X_n$, for the effects of changes in these
variables are allowed for in the remaining terms on the right.
The magnitude of $b_2$ gives, in fact, the mean change in $X_1$
associated with a unit change in $X_2$ when all the remaining
variables are kept constant. The correlation between $X_1$ and
$X_2$ indicated by $b_2$ may be termed a partial correlation, as
corresponding with the partial association of Chapter IV., and it
is required to deduce from the values of the coefficients $b$, which
may be termed partial regressions, partial coefficients of correlation
giving the correlation between $X_1$ and $X_2$ or other pair of
variables when the remaining variables $X_3 \ldots X_n$ are kept
constant, or when changes in these variables are corrected or allowed
for, so far as this may be done with a linear equation. For examples
of such generalised regression-equations the student may turn to
the illustrations worked out below (pp. 239-247).

3.  With  this  explanatory  introduction,  we  may  now  proceed  to 
the  algebraic  theory  of  such  generalised  regression-equations  and 
of  multiple  correlation  in  general.  It  will  first,  however,  be  as 
well  to  revert  briefly  to  the  case  of  two  variables.  In  Chapter  IX., 
to  obtain  the  greatest  possible  simplicity  of  treatment,  the  value 
of the coefficient $r = p/\sigma_1\sigma_2$ was deduced on the special assumption
that the means of all arrays were strictly collinear, and the
meaning of the coefficient in the more general case was subsequently
investigated. Such a process is not conveniently
applicable when a number of variables are to be taken into
account, and the problem has to be faced directly: i.e. required,
to determine the coefficients and constant term, if any, in a
regression-equation, so as to make the sum of the squares of the
errors of estimate a minimum. We will take this problem first
for the case of two variables, introducing a notation that can be
conveniently adapted to more. Let us take the arithmetic
means of the variables as origins of measurement, and let $x_1$, $x_2$
denote deviations of the two variables from their respective
means. Then it is required to determine $a_1$ and $b_{12}$ in a
regression-equation

$$x_1 = a_1 + b_{12}\, x_2 \tag{a}$$

so as to make $\Sigma(x_1 - \{a_1 + b_{12}\, x_2\})^2$, for all associated pairs of
deviations $x_1$ and $x_2$, the least possible. Put more briefly, if
we write

$$N s_{1.2}^2 = \Sigma(x_1 - \{a_1 + b_{12}\, x_2\})^2 \tag{b}$$

so that $s_{1.2}$ is the root-mean-square value of the errors of estimate
in using regression-equation (a) (cf. Chap. IX. § 14), it is required
to make $s_{1.2}$ a minimum. Suppose any value whatever to be
assigned to $b_{12}$, and a series of values of $a_1$ to be tried, $s_{1.2}$ being
calculated for each. Evidently $s_{1.2}$ would be very large for
values of $a_1$ that erred greatly either in excess or defect of the
best value (for the given value of $b_{12}$), and would continuously
decrease as this best value was approached; the value of $s_{1.2}$ could
never become negative, though possibly, but exceptionally, zero.
If therefore the values of $s_{1.2}$ were plotted to the values of $a_1$ on
a diagram, a curve would be obtained more or less like that
of fig. 44. The best value of $a_1$, for which $s_{1.2}$ attained its
minimum value, say $\sigma_{1.2}$, could be approximately estimated from
such a diagram; but it can be calculated with much more exactness
from the condition that if $a_1'$, $a_1''$ be two values close above
and below the best, the corresponding values of $s_{1.2}$ are equal. Let
$a_1$ and $(a_1 + \delta)$ be two such values. Then if

$$\Sigma(x_1 - \{a_1 + b_{12}\, x_2\})^2 = \Sigma(x_1 - \{a_1 + \delta + b_{12}\, x_2\})^2$$

when $\delta$ is very small, the value of $a_1$ is the best for the assigned
value of $b_{12}$. But, evidently, the equation gives, neglecting
the term in $\delta^2$,

$$\Sigma(x_1 - \{a_1 + b_{12}\, x_2\}) = 0,$$

that is, since the deviations $x_1$ and $x_2$ each sum to zero,

$$a_1 = 0$$

whatever the value of $b_{12}$.

Fig. 44.

This is the direct proof of the
result that no constant term need be introduced on the right
of a regression-equation when written in terms of deviations
from the arithmetic mean, or that the two lines of regression
must pass through the mean (Chap. IX. § 10). We may
therefore omit any constant term. If, now, $b_{12}$ is to be assigned
the best value, we must have, by similar reasoning, for slightly
differing values, $b_{12}$, $b_{12} + \delta$,

$$\Sigma(x_1 - b_{12}\, x_2)^2 = \Sigma(x_1 - [b_{12} + \delta]\, x_2)^2.$$

That is, again neglecting terms in $\delta^2$,

$$\Sigma x_2(x_1 - b_{12}\, x_2) = 0 \tag{c}$$

or, breaking up the sum,

$$b_{12} = \frac{\Sigma(x_1 x_2)}{\Sigma(x_2^2)} = r_{12}\frac{\sigma_1}{\sigma_2},$$

which is the value found by the previous indirect method of
Chapter IX. From the fact that $b_{12}$ is determined so as to
make the value of $\Sigma(x_1 - b_{12}\, x_2)^2$ the least possible, the method
of determination is sometimes called the method of least squares.
Evidently all the remaining results of Chapter IX. follow from
this, and notably we have for $\sigma_{1.2}$, the minimum value of $s_{1.2}$,
the standard-deviation of errors of estimate,

$$\sigma_{1.2}^2 = \sigma_1^2(1 - r_{12}^2) \tag{d}$$
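The least-squares argument of this section can be checked numerically. The sketch below is an illustration only: the eight pairs of observations are invented, and the helper names `deviations` and `mean_square_error` are not from the text. It finds $b_{12}$ from condition (c), confirms that moving either the slope or a constant term away from the best values raises the mean-square error, and compares the minimum with equation (d).

```python
import math

def deviations(values):
    """Deviations of the observations from their arithmetic mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def mean_square_error(x1, x2, a1, b12):
    """s_{1.2}^2 of equation (b): mean of (x1 - (a1 + b12*x2))^2."""
    return sum((u - (a1 + b12 * v)) ** 2 for u, v in zip(x1, x2)) / len(x1)

# Hypothetical observations, invented for this illustration only.
X1 = [2.0, 4.0, 5.0, 4.0, 5.0, 7.0, 8.0, 9.0]
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x1, x2 = deviations(X1), deviations(X2)
n = len(x1)

# Condition (c): sum(x2 * (x1 - b12*x2)) = 0 gives the slope directly.
b12 = sum(u * v for u, v in zip(x1, x2)) / sum(v * v for v in x2)

sigma1 = math.sqrt(sum(u * u for u in x1) / n)
sigma2 = math.sqrt(sum(v * v for v in x2) / n)
r12 = sum(u * v for u, v in zip(x1, x2)) / (n * sigma1 * sigma2)

# Mean-square error at the least-squares values a1 = 0, b12.
best = mean_square_error(x1, x2, 0.0, b12)

print("b12 =", b12, " r12*sigma1/sigma2 =", r12 * sigma1 / sigma2)
print("minimum s^2 =", best, " sigma1^2*(1 - r12^2) =", sigma1**2 * (1 - r12**2))
```

Trying `mean_square_error(x1, x2, 0.1, b12)` or a slope of `b12 + 0.05` gives a larger value than `best`, which is the numerical counterpart of the curve of fig. 44.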

4.  Now  apply  the  same  method  to  the  regression-equation 
for  n  variables.  Writing  the  equation  in  terms  of  deviations, 
it  follows  from  reasoning  precisely  similar  to  that  given  above 
that  no  constant  term  need  be  entered  on  the  right-hand 
side.  For  the  partial  regression-coefficients  (the  coefficients  of 
the $x$'s on the right) a special notation will be used in order
that  the  exact  position  of  each  coefficient  may  be  rendered  quite 
definite.  The  first  subscript  affixed  to  the  letter  b  (which  will 
always  be  used  to  denote  a  regression)  will  be  the  subscript  of 
the  X  on  the  left  (the  dependent  variable),  and  the  second  will 
be  the  subscript  of  the  x  to  which  it  is  attached ;  these  may 
be  called  the  primary  subscripts.  After  the  primary  subscripts, 
and  separated  from  them  by  a  point,  are  placed  the  subscripts 
of  all  the  remaining  variables  on  the  right-hand  side  as  secondary 
subscripts.  The  regression-equation  will  therefore  be  written 
in  the  form 

$$x_1 = b_{12.34 \ldots n}\, x_2 + b_{13.24 \ldots n}\, x_3 + \ldots + b_{1n.23 \ldots (n-1)}\, x_n \tag{1}$$

The order in which the secondary subscripts are written is,
it should be noted, quite indifferent, but the order of the
primary subscripts is material; e.g. $b_{12.3 \ldots n}$ and $b_{21.3 \ldots n}$
denote quite distinct coefficients, $x_1$ being the dependent variable
in the first case and $x_2$ in the second. A coefficient with $p$
secondary subscripts may be termed a regression of the $p$th order.
The regressions $b_{12}$, $b_{21}$, $b_{13}$, $b_{31}$, etc., in the case of two variables
may be regarded as of order zero, and may be termed total as
distinct  from  partial  regressions. 

5. In the case of two variables, the correlation-coefficient $r_{12}$
may be regarded as defined by the equation

$$r_{12} = (b_{12}\, b_{21})^{1/2}.$$

We shall generalise this equation in the form

$$r_{12.34 \ldots n} = (b_{12.34 \ldots n}\, b_{21.34 \ldots n})^{1/2} \tag{2}$$

This is at present a pure definition of a new symbol, and it
remains to be shown that $r_{12.34 \ldots n}$ may really be regarded as,
and possesses all the properties of, a correlation-coefficient; the
name may, however, be applied to it, pending the proof. A
correlation-coefficient with $p$ secondary subscripts will be termed
a correlation of order $p$. Evidently, in the case of a correlation-coefficient,
the order in which both primary and secondary
subscripts are written is indifferent, for the right-hand side of
equation (2) is unaltered by writing 2 for 1 and 1 for 2. The
correlations $r_{12}$, $r_{13}$, etc., may be regarded as of order zero, and
spoken  of  as  total,  as  distinct  from  partial,  correlations. 

6. If the regressions $b_{12.34 \ldots n}$, $b_{13.24 \ldots n}$, etc., be assigned the
"best" values, as determined by the method of least squares, the
difference between the actual value of $x_1$ and the value assigned
by the right-hand side of the regression-equation (1), that is, the
error of estimate, will be denoted by $x_{1.23 \ldots n}$; i.e. as a definition
we have

$$x_{1.23 \ldots n} = x_1 - b_{12.34 \ldots n}\, x_2 - b_{13.24 \ldots n}\, x_3 - \ldots - b_{1n.23 \ldots (n-1)}\, x_n \tag{3}$$

where $x_1, x_2, \ldots x_n$ are assigned any one set of observed values.

Such an error (or residual, as it is sometimes called) denoted by a
symbol with $p$ secondary suffixes, will be termed a deviation of the
$p$th order. Finally, we will define a generalised standard-deviation
$\sigma_{1.23 \ldots n}$ by the equation

$$N \sigma_{1.23 \ldots n}^2 = \Sigma(x_{1.23 \ldots n}^2) \tag{4}$$

$N$ being, as usual, the number of observations. A standard-deviation
denoted by a symbol with $p$ secondary suffixes will be
termed a standard-deviation of the $p$th order, the standard-deviations
$\sigma_1$, $\sigma_2$, etc., being regarded as of order zero, the standard-deviations
$\sigma_{1.2}$, $\sigma_{2.1}$, etc. (cf. eqn. (d) of § 3), of the first order, and
so on.

7. From the reasoning of § 3 it follows that the "least-square"
values of the partial regressions $b_{12.34 \ldots n}$, etc., will be given by
equations of the form

$$\Sigma(x_1 - b_{12.34 \ldots n}\, x_2 - \ldots - b_{1n.23 \ldots (n-1)}\, x_n)^2 = \Sigma(x_1 - [b_{12.34 \ldots n} + \delta]\, x_2 - \ldots - b_{1n.23 \ldots (n-1)}\, x_n)^2,$$

$\delta$ being very small. That is, neglecting the term in $\delta^2$,

$$\Sigma x_2(x_1 - b_{12.34 \ldots n}\, x_2 - \ldots - b_{1n.23 \ldots (n-1)}\, x_n) = 0,$$

or, more briefly, in terms of the notation of equation (3),

$$\Sigma(x_2\, x_{1.23 \ldots n}) = 0 \tag{5}$$

There are a large number of these equations, $(n - 1)$ for determining
the coefficients $b_{12.34 \ldots n}$, etc., $(n - 1)$ again for determining
the coefficients $b_{21.34 \ldots n}$, etc., and so on: they are sometimes
termed the normal equations. If the student will follow the process
by which (5) was obtained, he will see that when the condition
is expressed that $b_{12.34 \ldots n}$ shall possess the "least-square"
value, $x_2$ enters into the product-sum with $x_{1.23 \ldots n}$; when the
same condition is expressed for $b_{13.24 \ldots n}$, $x_3$ enters into the
product-sum, and so on. Taking each regression in turn, in fact,
every $x$ the suffix of which is included in the secondary suffixes
of $x_{1.23 \ldots n}$ enters into the product-sum. The normal equations
of the form (5) are therefore equivalent to the theorem:

The  product-sum  of  any  deviation  of  order  zero  with  any  deviation 
of  higher  order  is  zero,  provided  the  subscript  of  the  former  occur 
among  the  secondary  subscripts  of  the  latter. 

8. But it follows from this that

$$\Sigma(x_{1.34 \ldots n}\, x_{2.34 \ldots n}) = \Sigma x_{1.34 \ldots n}(x_2 - b_{23.4 \ldots n}\, x_3 - \ldots - b_{2n.34 \ldots (n-1)}\, x_n) = \Sigma(x_{1.34 \ldots n}\, x_2).$$

Similarly,

$$\Sigma(x_{1.34 \ldots n}\, x_{2.34 \ldots n}) = \Sigma(x_1\, x_{2.34 \ldots n}).$$

Similarly again,

$$\Sigma(x_{1.34 \ldots n}\, x_{2.34 \ldots (n-1)}) = \Sigma(x_{1.34 \ldots n}\, x_2),$$

and so on. Therefore, quite generally,

$$\begin{aligned}
\Sigma(x_{1.34 \ldots n}\, x_{2.34 \ldots n}) &= \Sigma(x_{1.34 \ldots (n-1)}\, x_{2.34 \ldots n}) \\
&= \Sigma(x_1\, x_{2.34 \ldots n}) \\
&= \Sigma(x_{1.34 \ldots n}\, x_{2.34 \ldots (n-1)}) \\
&= \Sigma(x_{1.34 \ldots n}\, x_2).
\end{aligned}$$

Comparing all the equal product-sums that may be obtained
in this way, we see that the product-sum of any two deviations is
unaltered by omitting any or all of the secondary subscripts of either
which are common to the two, and, conversely, the product-sum of any
deviation of order $p$ with a deviation of order $p + q$, the $p$ subscripts
being the same in each case, is unaltered by adding to the secondary
subscripts of the former any or all of the $q$ additional subscripts of
the latter.

It follows therefore from (5) that any product-sum is zero if all
the subscripts of the one deviation occur among the secondary
subscripts of the other. As the simplest case, we may note that $x_1$ is
uncorrelated with $x_{2.1}$, and $x_2$ uncorrelated with $x_{1.2}$.

The  theorems  of  this  and  of  the  preceding  paragraph  are  of 
fundamental  importance,  and  should  be  carefully  remembered. 
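The two theorems can be seen at work on a small case. The following is a sketch under stated assumptions: the three series are invented, the helper names are not from the text, and only first-order deviations $x_{1.3}$ and $x_{2.3}$ are formed, each by a simple regression on $x_3$.

```python
def deviations(values):
    """Deviations of the observations from their arithmetic mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def product_sum(u, v):
    """Sum of products of associated deviations."""
    return sum(a * b for a, b in zip(u, v))

def residual(y, x):
    """First-order deviation y.x: y minus its least-squares estimate from x."""
    b = product_sum(y, x) / product_sum(x, x)
    return [a - b * c for a, c in zip(y, x)]

# Hypothetical observations, invented for this illustration only.
x1 = deviations([3.0, 5.0, 4.0, 8.0, 7.0, 9.0, 6.0, 10.0])
x2 = deviations([1.0, 4.0, 2.0, 6.0, 5.0, 9.0, 3.0, 8.0])
x3 = deviations([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0])

x1_3 = residual(x1, x3)  # x_{1.3}
x2_3 = residual(x2, x3)  # x_{2.3}

# Theorem of section 7: x_3 occurs among the secondary subscripts of x_{1.3},
# so this product-sum vanishes apart from rounding error.
print(product_sum(x3, x1_3))

# Theorem of section 8: the common secondary subscript 3 may be dropped
# from either deviation without altering the product-sum.
print(product_sum(x1_3, x2_3), product_sum(x1, x2_3), product_sum(x1_3, x2))
```

The three values on the last line agree to rounding error, while `product_sum(x1, x2)` itself is in general different, since in that case the subscript 3 has been dropped from both deviations at once.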
9. We have now from §§ 7 and 8:

$$\begin{aligned}
0 &= \Sigma(x_{2.34 \ldots n}\, x_{1.23 \ldots n}) \\
&= \Sigma x_{2.34 \ldots n}(x_1 - b_{12.34 \ldots n}\, x_2 - \text{terms in } x_3 \text{ to } x_n) \\
&= \Sigma(x_1\, x_{2.34 \ldots n}) - b_{12.34 \ldots n}\, \Sigma(x_2\, x_{2.34 \ldots n}) \\
&= \Sigma(x_{1.34 \ldots n}\, x_{2.34 \ldots n}) - b_{12.34 \ldots n}\, \Sigma(x_{2.34 \ldots n}^2).
\end{aligned}$$

That is

$$b_{12.34 \ldots n} = \frac{\Sigma(x_{1.34 \ldots n}\, x_{2.34 \ldots n})}{\Sigma(x_{2.34 \ldots n}^2)}.$$

But this is the value that would have been obtained by taking a
regression-equation of the form

$$x_{1.34 \ldots n} = b_{12.34 \ldots n}\, x_{2.34 \ldots n}$$

and determining $b_{12.34 \ldots n}$ by the method of least squares, i.e.
$b_{12.34 \ldots n}$ is the regression of $x_{1.34 \ldots n}$ on $x_{2.34 \ldots n}$. It follows
at once from (2) that $r_{12.34 \ldots n}$ is the correlation between
$x_{1.34 \ldots n}$ and $x_{2.34 \ldots n}$, and from (4) that we may write

$$b_{12.34 \ldots n} = r_{12.34 \ldots n}\, \frac{\sigma_{1.34 \ldots n}}{\sigma_{2.34 \ldots n}} \tag{8}$$

an equation identical with the familiar relation $b_{12} = r_{12}\, \sigma_1/\sigma_2$,
with the secondary suffixes $34 \ldots n$ added throughout.

To illustrate the meaning of the equation by the simplest case,
if we had three variables only, $x_1$, $x_2$, and $x_3$, the value of $b_{12.3}$
or $r_{12.3}$ could be determined (1) by finding the correlations $r_{13}$ and
$r_{23}$ and the corresponding regressions $b_{13}$ and $b_{23}$; (2) working out
the residuals $x_1 - b_{13}\, x_3$ and $x_2 - b_{23}\, x_3$ for all associated deviations;
(3) working out the correlation between the residuals associated
with the same values of $x_3$. The method would not, however, be
a practical one, as the arithmetic would be extremely lengthy,
much more lengthy than the method given below for expressing
a correlation of order $p$ in terms of correlations of order $p - 1$.
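Although lengthy by hand, the three-step residual method is easy to carry out by machine, and doing so gives a check on the interpretation above. In the sketch below the data are invented and the function names are illustrative assumptions. The correlation of the residuals is compared with the three-variable expression $(r_{12} - r_{13} r_{23})/\{(1 - r_{13}^2)(1 - r_{23}^2)\}^{1/2}$ established in § 13, and the regression of the residuals with the coefficient of $x_2$ obtained by solving the two normal equations for $x_1$ on $x_2$ and $x_3$ directly.

```python
import math

def deviations(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def product_sum(u, v):
    return sum(a * b for a, b in zip(u, v))

def correlation(u, v):
    """Correlation between two series of deviations."""
    return product_sum(u, v) / math.sqrt(product_sum(u, u) * product_sum(v, v))

def residual(y, x):
    """y minus its least-squares estimate from x (both as deviations)."""
    b = product_sum(y, x) / product_sum(x, x)
    return [a - b * c for a, c in zip(y, x)]

# Hypothetical observations, invented for this illustration only.
x1 = deviations([3.0, 5.0, 4.0, 8.0, 7.0, 9.0, 6.0, 10.0])
x2 = deviations([1.0, 4.0, 2.0, 6.0, 5.0, 9.0, 3.0, 8.0])
x3 = deviations([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0])

# Steps (1) and (2): residuals of x1 and x2 after regression on x3.
x1_3, x2_3 = residual(x1, x3), residual(x2, x3)

# Step (3): correlation and regression between the residuals.
r12_3_residuals = correlation(x1_3, x2_3)
b12_3_residuals = product_sum(x1_3, x2_3) / product_sum(x2_3, x2_3)

# Shorter route: the three-variable formula in the zero-order correlations.
r12, r13, r23 = correlation(x1, x2), correlation(x1, x3), correlation(x2, x3)
r12_3_formula = (r12 - r13 * r23) / math.sqrt((1 - r13**2) * (1 - r23**2))

# Direct route for the regression: solve the two normal equations
# for x1 = b12.3*x2 + b13.2*x3 by Cramer's rule.
s22, s33, s23 = product_sum(x2, x2), product_sum(x3, x3), product_sum(x2, x3)
s12, s13 = product_sum(x1, x2), product_sum(x1, x3)
b12_3_normal = (s12 * s33 - s13 * s23) / (s22 * s33 - s23**2)

print(r12_3_residuals, r12_3_formula)
print(b12_3_residuals, b12_3_normal)
```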

10. Any standard-deviation of order $p$ may be expressed in terms
of a standard-deviation of order $p - 1$ and a correlation of order $p - 1$.
For,

$$\begin{aligned}
\Sigma(x_{1.23 \ldots n}^2) &= \Sigma(x_{1.23 \ldots (n-1)}\, x_{1.23 \ldots n}) \\
&= \Sigma x_{1.23 \ldots (n-1)}(x_1 - b_{1n.23 \ldots (n-1)}\, x_n - \text{terms in } x_2 \text{ to } x_{n-1}) \\
&= \Sigma(x_{1.23 \ldots (n-1)}^2) - b_{1n.23 \ldots (n-1)}\, \Sigma(x_{1.23 \ldots (n-1)}\, x_{n.23 \ldots (n-1)}),
\end{aligned}$$

or, dividing through by the number of observations,

$$\begin{aligned}
\sigma_{1.23 \ldots n}^2 &= \sigma_{1.23 \ldots (n-1)}^2 (1 - b_{1n.23 \ldots (n-1)}\, b_{n1.23 \ldots (n-1)}) \\
&= \sigma_{1.23 \ldots (n-1)}^2 (1 - r_{1n.23 \ldots (n-1)}^2)
\end{aligned} \tag{9}$$

This is again the relation of the familiar form

$$\sigma_{1.n}^2 = \sigma_1^2(1 - r_{1n}^2)$$

with the secondary suffixes $23 \ldots (n - 1)$ added throughout.

It is clear from (9) that $r_{1n.23 \ldots (n-1)}$, like any correlation of order
zero, cannot be numerically greater than unity. It also follows
at once that if we have been estimating $x_1$ from $x_2, x_3, \ldots x_{n-1}$,
$x_n$ will not increase the accuracy of estimate unless $r_{1n.23 \ldots (n-1)}$
(not $r_{1n}$) differ from zero. This condition is somewhat interesting,
as it leads to rather unexpected results. For example, if $r_{12} = +0.8$,
$r_{13} = +0.4$, $r_{23} = +0.5$, it will not be possible to estimate $x_1$ with
any greater accuracy from $x_2$ and $x_3$ than from $x_2$ alone, for the
value of $r_{13.2}$ is zero (see below, § 13).
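The numerical example just given can be worked through in a few lines. This is a sketch only: the function name `first_order` is an assumption, and the formula it applies is the three-variable expression of § 13. It evaluates $r_{13.2}$ for the stated correlations and then compares the proportion of $\sigma_1^2$ left unexplained by $x_2$ alone with that left by $x_2$ and $x_3$ together, using equation (9).

```python
import math

def first_order(r_ab, r_ac, r_bc):
    """First-order correlation r_{ab.c} from three zero-order correlations."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))

# Correlations quoted in the text.
r12, r13, r23 = 0.8, 0.4, 0.5

r13_2 = first_order(r13, r12, r23)

# Equation (9): sigma_{1.2}^2 / sigma_1^2, then sigma_{1.23}^2 / sigma_1^2.
unexplained_by_x2 = 1 - r12**2
unexplained_by_x2_and_x3 = unexplained_by_x2 * (1 - r13_2**2)

print(r13_2, unexplained_by_x2, unexplained_by_x2_and_x3)
```

Although $r_{13}$ itself is +0.4, the two proportions coincide, which is the "rather unexpected" result referred to.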

11. It should be noted that, in equation (9), any other subscript
can be eliminated in the same way as subscript $n$ from the suffix of
$\sigma_{1.23 \ldots n}$, so that a standard-deviation of order $p$ can be expressed
in $p$ ways in terms of standard-deviations of the next lower order.
This is useful as affording an independent check on arithmetic.
Further, $\sigma_{1.23 \ldots (n-1)}$ can be expressed in the same way in terms
of $\sigma_{1.23 \ldots (n-2)}$, and so on, so that we must have

$$\sigma_{1.23 \ldots n}^2 = \sigma_1^2 (1 - r_{12}^2)(1 - r_{13.2}^2)(1 - r_{14.23}^2) \ldots (1 - r_{1n.23 \ldots (n-1)}^2) \tag{10}$$

This  is  an  extremely  convenient  expression  for  arithmetical  use  ; 
the  arithmetic  can  again  be  subjected  to  an  absolute  check  by 
eliminating  the  subscripts  in  a  different,  say  the  inverse,  order. 
Apart  from  the  algebraic  proof,  it  is  obvious  that  the  values  must 
be  identical ;  for  if  we  are  estimating  one  variable  from  n  others,  it 
is  clearly  indifferent  in  what  order  the  latter  are  taken  into  account. 
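The check by reversing the order of elimination can be sketched for three variables as follows. The zero-order correlations used are hypothetical values chosen for illustration, and `first_order` is an assumed helper implementing the three-variable formula of § 13.

```python
import math

def first_order(r_ab, r_ac, r_bc):
    """First-order correlation r_{ab.c} from three zero-order correlations."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))

# Hypothetical zero-order correlations, for illustration only.
r12, r13, r23 = 0.6, 0.5, 0.4

r13_2 = first_order(r13, r12, r23)
r12_3 = first_order(r12, r13, r23)

# Equation (10) for sigma_{1.23}^2 / sigma_1^2, taking x2 into account first ...
ratio_2_then_3 = (1 - r12**2) * (1 - r13_2**2)
# ... and then taking x3 into account first.
ratio_3_then_2 = (1 - r13**2) * (1 - r12_3**2)

print(ratio_2_then_3, ratio_3_then_2)
```

Agreement of the two products to the last figure retained is the "absolute check" of the text; a slip in any one of the partial correlations will generally destroy it.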

12. Any regression of order $p$ may be expressed in terms of
regressions of order $p - 1$. For we have

$$\begin{aligned}
\Sigma(x_{1.34 \ldots n}\, x_{2.34 \ldots n}) &= \Sigma(x_{1.34 \ldots (n-1)}\, x_{2.34 \ldots n}) \\
&= \Sigma x_{1.34 \ldots (n-1)}(x_2 - b_{2n.34 \ldots (n-1)}\, x_n - \text{terms in } x_3 \text{ to } x_{n-1}) \\
&= \Sigma(x_{1.34 \ldots (n-1)}\, x_{2.34 \ldots (n-1)}) - b_{2n.34 \ldots (n-1)}\, \Sigma(x_{1.34 \ldots (n-1)}\, x_{n.34 \ldots (n-1)}).
\end{aligned}$$

Replacing $b_{2n.34 \ldots (n-1)}$ by $b_{n2.34 \ldots (n-1)}\, \sigma_{2.34 \ldots (n-1)}^2 / \sigma_{n.34 \ldots (n-1)}^2$,
we have

$$b_{12.34 \ldots n}\, \sigma_{2.34 \ldots n}^2 = b_{12.34 \ldots (n-1)}\, \sigma_{2.34 \ldots (n-1)}^2 - b_{1n.34 \ldots (n-1)}\, b_{n2.34 \ldots (n-1)}\, \sigma_{2.34 \ldots (n-1)}^2,$$

or, from (9),

$$b_{12.34 \ldots n} = \frac{b_{12.34 \ldots (n-1)} - b_{1n.34 \ldots (n-1)}\, b_{n2.34 \ldots (n-1)}}{1 - b_{2n.34 \ldots (n-1)}\, b_{n2.34 \ldots (n-1)}} \tag{11}$$

The student should note that this is an expression of the form

$$b_{12.n} = \frac{b_{12} - b_{1n}\, b_{n2}}{1 - b_{2n}\, b_{n2}}$$

with the subscripts $34 \ldots (n - 1)$ added throughout. The
coefficient $b_{12.34 \ldots n}$ may therefore be regarded as determined
from a regression-equation of the form

$$x_{1.34 \ldots (n-1)} = b_{12.34 \ldots n}\, x_{2.34 \ldots (n-1)} + b_{1n.23 \ldots (n-1)}\, x_{n.34 \ldots (n-1)},$$

i.e. it is the partial regression of $x_{1.34 \ldots (n-1)}$ on $x_{2.34 \ldots (n-1)}$,
$x_{n.34 \ldots (n-1)}$ being given. As any other secondary suffix might
have been eliminated in lieu of $n$, we might also regard it as
the partial regression of $x_{1.45 \ldots n}$ on $x_{2.45 \ldots n}$, $x_{3.45 \ldots n}$ being
given, and so on.
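Equation (11) can be checked against equation (8) for three variables. The sketch below borrows the standard-deviations and correlations of Example i of § 14, and uses the three-variable correlation formula of § 13. The function name `total_regression` is an assumption made for the illustration.

```python
import math

# Constants of Example i of section 14, as quoted in the text.
sigma1, sigma2, sigma3 = 1.71, 1.29, 3.09
r12, r13, r23 = -0.66, -0.13, 0.60

def total_regression(r, sigma_dep, sigma_ind):
    """Regression of order zero: b = r * sigma_dep / sigma_ind."""
    return r * sigma_dep / sigma_ind

b12 = total_regression(r12, sigma1, sigma2)
b13 = total_regression(r13, sigma1, sigma3)
b23 = total_regression(r23, sigma2, sigma3)
b32 = total_regression(r23, sigma3, sigma2)

# Equation (11) with n = 3 and no further secondary suffixes.
b12_3_from_11 = (b12 - b13 * b32) / (1 - b23 * b32)

# Equation (8): b12.3 = r12.3 * sigma_{1.3} / sigma_{2.3},
# with the first-order standard-deviations taken from equation (9).
r12_3 = (r12 - r13 * r23) / math.sqrt((1 - r13**2) * (1 - r23**2))
sigma1_3 = sigma1 * math.sqrt(1 - r13**2)
sigma2_3 = sigma2 * math.sqrt(1 - r23**2)
b12_3_from_8 = r12_3 * sigma1_3 / sigma2_3

print(b12_3_from_11, b12_3_from_8)
```

Both routes give the same coefficient, but the route through (8) reuses correlations and standard-deviations that are wanted in any case.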

13. From equation (11) we may readily obtain a corresponding
equation for correlations. For (11) may be written

$$b_{12.34 \ldots n} = \frac{r_{12.34 \ldots (n-1)} - r_{1n.34 \ldots (n-1)}\, r_{2n.34 \ldots (n-1)}}{1 - r_{2n.34 \ldots (n-1)}^2} \cdot \frac{\sigma_{1.34 \ldots (n-1)}}{\sigma_{2.34 \ldots (n-1)}}.$$

Hence, writing down the corresponding expression for $b_{21.34 \ldots n}$
and taking the square root,

$$r_{12.34 \ldots n} = \frac{r_{12.34 \ldots (n-1)} - r_{1n.34 \ldots (n-1)}\, r_{2n.34 \ldots (n-1)}}{(1 - r_{1n.34 \ldots (n-1)}^2)^{1/2}\, (1 - r_{2n.34 \ldots (n-1)}^2)^{1/2}} \tag{12}$$

This is, similarly, the expression for three variables

$$r_{12.n} = \frac{r_{12} - r_{1n}\, r_{2n}}{(1 - r_{1n}^2)^{1/2}\, (1 - r_{2n}^2)^{1/2}}$$

with the secondary subscripts added throughout, and $r_{12.34 \ldots n}$
can be assigned interpretations corresponding to those of $b_{12.34 \ldots n}$
above. Evidently equation (12) permits of an absolute check of
the arithmetic in the calculation of all partial coefficients of an
order higher than the first, for any one of the secondary suffixes
of $r_{12.34 \ldots n}$ can be eliminated so as to obtain another equation of
the same form as (12), and the value obtained for $r_{12.34 \ldots n}$ by
inserting the values of the coefficients of lower order in the
expression on the right must be the same in each case.
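Equation (12) lends itself to repeated application, each step lowering the order by one. The following is a minimal sketch, assuming a hypothetical set of zero-order correlations for four variables. The function `partial_corr` and its argument layout are illustrative choices, not anything prescribed by the text. It computes $r_{12.34}$ twice, eliminating the secondary suffixes in the two possible orders.

```python
import math

def partial_corr(r, i, j, given=()):
    """Correlation between variables i and j with the variables in `given`
    allowed for, by repeated use of equation (12).

    `r` maps frozenset({a, b}) to the zero-order correlation r_ab.
    The last suffix in `given` is the one eliminated at this step.
    """
    if not given:
        return r[frozenset((i, j))]
    *rest, k = given
    rest = tuple(rest)
    r_ij = partial_corr(r, i, j, rest)
    r_ik = partial_corr(r, i, k, rest)
    r_jk = partial_corr(r, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik**2) * (1 - r_jk**2))

# Hypothetical zero-order correlations for four variables.
zero_order = {
    frozenset((1, 2)): 0.5, frozenset((1, 3)): 0.4, frozenset((1, 4)): 0.3,
    frozenset((2, 3)): 0.3, frozenset((2, 4)): 0.2, frozenset((3, 4)): 0.1,
}

# r_{12.34}: eliminate suffix 4 at the last step, then suffix 3 at the last step.
r12_34_a = partial_corr(zero_order, 1, 2, (3, 4))
r12_34_b = partial_corr(zero_order, 1, 2, (4, 3))

print(r12_34_a, r12_34_b)
```

The two results agree to rounding error, which is the check described above. The recursion recomputes lower-order coefficients several times; for hand work, or for many variables, tabulating each order once is the more economical plan.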

14.  The  equations  now  obtained  provide  all  that  is  necessary 
for  the  arithmetical  solution  of  problems  in  multiple  correlation. 
The  best  mode  of  procedure  on  the  whole,  having  calculated  all 
the  correlations  and  standard-deviations  of  order  zero,  is  (1)  to 
calculate  the  correlations  of  higher  order  by  successive  applications 
of  equation  (12) ;  (2)  to  calculate  any  required  standard-deviations 
by  equation  (10) ;  (3)  to  calculate  any  required  regressions  by 
equation  (8):  the  use  of  equation  (11)  for  calculating  the 
regressions of successive orders directly from each other is comparatively
clumsy. We will give two illustrations, the first for
three and the second for four variables. The introduction of
more  variables  does  not  involve  any  difference  in  the  form  of  the 
arithmetic,  but  rapidly  increases  the  amount. 

Example i. The first illustration we shall take will be a
continuation of example i. of Chapter IX., in which the correlation
was worked out between (1) the average earnings of agricultural
labourers and (2) the percentage of the population in receipt of
Poor-law relief in a group of 38 rural districts. In Question 2 of
the same chapter are given (3) the ratios of the numbers in receipt
of outdoor relief to the numbers relieved in the workhouse, in the
same districts. Required to work out the partial correlations,
regressions, etc., for these three variables.

Using as our notation $X_1$ = average earnings, $X_2$ = percentage of
population in receipt of relief, $X_3$ = out-relief ratio, the first
constants determined are:

M_1 = 15.9 shillings     σ_1 = 1.71 shillings     r_12 = -0.66
M_2 = 3.67 per cent.     σ_2 = 1.29 per cent.     r_13 = -0.13
M_3 = 5.79               σ_3 = 3.09               r_23 = +0.60

To obtain the partial correlations, equation (12) is used direct in
its simplest form:

\[ r_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{(1 - r_{13}^{2})^{1/2}\,(1 - r_{23}^{2})^{1/2}} \]
The work is best done systematically and the results collected
in tabular form, especially if logarithms are used, as many of the
logarithms occur repeatedly. First it will be noted that the
logarithms of $(1 - r^2)^{1/2}$ occur in all the denominators; these had,
accordingly, better be worked out at once and tabulated (col. 2 of
the table below). In col. 3 the product term of the numerator of
each partial coefficient is entered, i.e. the product of the two other
coefficients on the remaining lines in col. 1; subtracting this from
the coefficient on the same line in col. 1 we have the numerator
(col. 4) and can enter its logarithm. The logarithm of the denominator
(col. 6) is obtained at once by adding the two logarithms of
$(1 - r^2)^{1/2}$ on the remaining lines of the table, and subtracting the
logarithms of the denominators from those of the numerators we have
the logarithms of the correlations of the first-order. It is also as
well to calculate at once, for reference in the calculation of
standard-deviations of the second-order, the values of
log √(1 - r²) for the first-order coefficients (col. 9).

     1.            2.          3.         4.         5.         6.         7.          8.              9.
     r.        log √(1-r²)  Product   Numerator.    log        log      Correlation of First Order   log √(1-r²)
                             Term.                  Num.      Denom.       log.        Value.

r12 = -0.66     1̄.87580    -0.0780    -0.5820    1̄.76492   1̄.89938    1̄.86554   r12.3 = -0.73     1̄.83216
r13 = -0.13     1̄.99629    -0.3960    +0.2660    1̄.42488   1̄.77889    1̄.64599   r13.2 = +0.44     1̄.95267
r23 = +0.60     1̄.90309    +0.0858    +0.5142    1̄.71113   1̄.87209    1̄.83904   r23.1 = +0.69     1̄.85946
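The tabular scheme just described can be mirrored, without logarithms, in a short Python sketch. The names `first_order` and `rows` are illustrative assumptions, and ordinary floating-point division stands in for the subtraction of logarithms used in the text.

```python
from math import sqrt

# Zero-order correlations of Example i.
r = {"12": -0.66, "13": -0.13, "23": 0.60}

def first_order(target, other_a, other_b):
    """Return (product term, numerator, denominator, partial r) for one line."""
    product = r[other_a] * r[other_b]        # col. 3
    numerator = r[target] - product          # col. 4
    denominator = sqrt((1 - r[other_a] ** 2) * (1 - r[other_b] ** 2))
    return product, numerator, denominator, numerator / denominator

rows = {
    "12.3": first_order("12", "13", "23"),
    "13.2": first_order("13", "12", "23"),
    "23.1": first_order("23", "12", "13"),
}
for name, (prod, num, den, value) in rows.items():
    print(f"r{name}: product {prod:+.4f}  numerator {num:+.4f}  value {value:+.2f}")
```

Each line uses the two coefficients on the remaining lines, just as in cols. 3 to 8 of the table.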

Having obtained the correlations we can now proceed to the
regressions. If we wish to find all the regression-equations, we
shall have six regressions to calculate from equations of the form

\[ b_{12.3} = r_{12.3}\, \sigma_{1.3} / \sigma_{2.3} . \]

These will involve all the six standard-deviations of the first
order $\sigma_{1.2}$, $\sigma_{1.3}$, $\sigma_{2.1}$, $\sigma_{2.3}$, etc. But the
standard-deviations of the first-order are not in themselves of much
interest, and the standard-deviations of the second-order are so, as
being the standard-errors or root-mean-square errors of estimate made
in using the regression-equations of the second-order. We may
save needless arithmetic, therefore, by replacing the
standard-deviations of the first-order by those of the second,
omitting the former entirely, and transforming the above equation for
$b_{12.3}$ to the form

\[ b_{12.3} = r_{12.3}\, \sigma_{1.23} / \sigma_{2.13} . \]

This transformation is a useful one and should be noted by the
student. The values of each σ may be calculated twice independently
by the formulae of the form

\[ \sigma_{1.23} = \sigma_1 (1 - r_{12}^{2})^{1/2} (1 - r_{13.2}^{2})^{1/2}
               = \sigma_1 (1 - r_{13}^{2})^{1/2} (1 - r_{12.3}^{2})^{1/2} \]

so as to check the arithmetic; the work is rapidly done if the
values of log √(1 - r²) have been tabulated. The values found are

log σ_1.23 = 0.06146     σ_1.23 = 1.15
log σ_2.13 = 1̄.84584     σ_2.13 = 0.70
log σ_3.12 = 0.34571     σ_3.12 = 2.22

From these and the logarithms of the r's we have

log b_12.3 = 0.08116,  b_12.3 = -1.21 :  log b_13.2 = 1̄.36174,  b_13.2 = +0.23
log b_21.3 = 1̄.64993,  b_21.3 = -0.45 :  log b_23.1 = 1̄.33917,  b_23.1 = +0.22
log b_31.2 = 1̄.93024,  b_31.2 = +0.85 :  log b_32.1 = 0.33891,  b_32.1 = +2.18

That is, the regression-equations are

(1)  x_1 = -1.21 x_2 + 0.23 x_3

(2)  x_2 = -0.45 x_1 + 0.22 x_3

(3)  x_3 = +0.85 x_1 + 2.18 x_2

or, transferring the origins to zero,

(1)  Earnings           X_1 = +19.0 - 1.21 X_2 + 0.23 X_3

(2)  Pauperism          X_2 = +9.55 - 0.45 X_1 + 0.22 X_3

(3)  Out-relief ratio   X_3 = -15.7 + 0.85 X_1 + 2.18 X_2

The units are throughout one shilling for the earnings $X_1$, 1
per cent. for the pauperism $X_2$, and 1 for the out-relief ratio $X_3$.
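The whole chain for Example i., from the constants of order zero to the regression-equations with origins at zero, can be retraced with the following Python sketch. The helper names (`r0`, `r1`, `sd2`, `b`, `equations`) are assumptions made for the illustration. The regressions are rounded to two decimals before the constant terms are formed, on the assumption that this is how the constants quoted above were obtained; small differences in the last digit are therefore to be expected.

```python
from math import sqrt

mean = {1: 15.9, 2: 3.67, 3: 5.79}
sd = {1: 1.71, 2: 1.29, 3: 3.09}
r = {(1, 2): -0.66, (1, 3): -0.13, (2, 3): 0.60}

def r0(i, j):
    """Zero-order correlation, whichever way round the suffixes are given."""
    return r[(min(i, j), max(i, j))]

def r1(i, j, k):
    """First-order partial correlation r_{ij.k}, equation (12)."""
    return (r0(i, j) - r0(i, k) * r0(j, k)) / sqrt(
        (1 - r0(i, k) ** 2) * (1 - r0(j, k) ** 2))

def sd2(i, j, k):
    """sigma_{i.jk} = sigma_i (1 - r_ij^2)^(1/2) (1 - r_ik.j^2)^(1/2)."""
    return sd[i] * sqrt(1 - r0(i, j) ** 2) * sqrt(1 - r1(i, k, j) ** 2)

def b(i, j, k):
    """b_{ij.k} = r_{ij.k} * sigma_{i.jk} / sigma_{j.ik}."""
    return r1(i, j, k) * sd2(i, j, k) / sd2(j, i, k)

equations = {}
for i, j, k in [(1, 2, 3), (2, 1, 3), (3, 1, 2)]:
    bj, bk = round(b(i, j, k), 2), round(b(i, k, j), 2)
    const = mean[i] - bj * mean[j] - bk * mean[k]   # transfers the origin to zero
    equations[i] = (const, bj, bk)
    print(f"X{i} = {const:+.2f} {bj:+.2f} X{j} {bk:+.2f} X{k}")
```

Calling `sd2(1, 2, 3)` and `sd2(1, 3, 2)` gives the two independent evaluations of $\sigma_{1.23}$ recommended in the text as a check.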

The first and second regression-equations are those of most
practical importance. The argument has been advanced that
the giving of out-relief tends to lower earnings, and the total
coefficient ($r_{13} = -0.13$) between earnings ($X_1$) and out-relief
($X_3$), though very small (cf. Chap. IX. § 17), does not seem
inconsistent with such a hypothesis. The partial correlation
coefficient ($r_{13.2} = +0.44$) and the regression-equation (1),
however, indicate that in unions with a given percentage of the
population in receipt of relief ($X_2$) the earnings are highest where
the proportion of out-relief is highest; and this is, in so far,
against the hypothesis of a tendency to lower wages. It remains
possible, of course, that out-relief may adversely affect the
possibility of earning, e.g. by limiting the employment of the old. As
regards pauperism, the argument might be advanced that the
observed correlation ($r_{23} = +0.60$) between pauperism and
out-relief was in part due to the negative correlation
($r_{13} = -0.13$) between earnings and out-relief. Such a hypothesis
would have little to support it in view of the smallness and doubtful
significance of $r_{13}$, and is definitely contradicted by the positive
partial correlation $r_{23.1} = +0.69$, and the second
regression-equation. The third regression-equation shows that the
proportion of out-relief is on the whole highest where earnings are
highest and pauperism greatest. It should be noticed, however, that a
negative ratio is clearly impossible, and consequently the relation
cannot be strictly linear; but the third equation gives possible
(positive) average ratios for all the combinations of pauperism and
earnings that actually occur.

Example  ii. — (Four  variables.)  As  an  illustration  of  the  form 
of  the  work  in  the  case  of  four  variables,  we  will  take  a  portion 
of  the  data  from  another  investigation  into  the  causation  of 
pauperism,  viz.  that  described  in  the  first  illustration  of  Chapter  X., 
to  which  the  student  should  refer  for  details.  The  variables  are 
the  ratios  of  the  values  in  1891  to  the  values  in  1881  (taken  as 
100)  of— 

1.  The  percentage  of  the  population  in  receipt  of  relief, 

2.  The  ratio  of  the  numbers  given  outdoor  relief  to  the  numbers 
relieved  in  the  workhouse, 

3.  The  percentage  of  the  population  over  65  years  of  age, 

4.  The  population  itself,
in  the  metropolitan  group  of  32  unions,  and  the  fundamental 
constants  (means,  standard-deviations  and  correlations)  are  as 
follows : — 

Table I.

       1.                  2.                   3.                  4.
     Means.        Standard-deviations.   Correlation-         log √(1-r²).
                                          coefficient.

   1    104.7         1     29.2          12    +0.52          1̄.93154
   2     90.6         2     41.7          13    +0.41          1̄.96003
   3    107.7         3      5.5          14    -0.14          1̄.99570
   4    111.3         4     23.8          23    +0.49          1̄.94038
                                          24    +0.23          1̄.98820
                                          34    +0.25          1̄.98598

It is seen that the average changes are not great; the
percentages of the population in receipt of relief have increased on
an average by 4.7 per cent., the out-relief ratio has dropped by
9.4 per cent., and the percentage of old has increased by 7.7
per cent., at the same time as the population of the unions has
risen on the average by 11.3 per cent. At the same time the
standard-deviations of the first, second, and fourth variables are
very large. As a matter of fact, while in one union the
pauperism decreased by nearly 50 per cent. and in others by
20 per cent., in some there were increases of 60, 80, and 90
per cent.; similarly, in the case of the out-relief, in several unions
the ratio was decreased by 40 to 60 per cent., a consistent
anti-out-relief policy having been enforced; in others the ratio
was doubled, and more than doubled. As regards population,
the more central districts show decreases ranging up to 20 and
25 per cent., the circumferential districts increases of 45 to 80
per cent. The correlations of order zero are not large, the
changes in the rate of pauperism exhibiting the highest correlation
with changes in the out-relief ratio, slightly less with changes
in the proportion of old, and very little with changes in
population.

The correlations of the second order are obtained in two steps.
In the first place, the six coefficients of order zero are grouped in
four sets of three, corresponding to the four sets of three variables
formed by omitting each one of the four variables in turn (Table
II. col. 1). Each of these sets of three coefficients is then
treated in the same manner as in the last example, and so the
correlations of the first order (Table II. col. 4) are obtained.
The first-order coefficients are then regrouped in sets of three,
with the same secondary suffix (Table III. col. 1), and these
are treated precisely in the same way as the coefficients of order
zero. In this way, it will be seen, the value of each coefficient
of the second order is arrived at in two ways independently, and
so the arithmetic is checked: $r_{12.34}$ occurs in the first and fourth
lines, for instance, $r_{13.24}$ in the second and seventh, and so on.
Of course slight differences may occur in the last digit if a
sufficient number of digits is not retained, and for this reason the
intermediate work should be carried to a greater degree of
accuracy than is necessary in the final result; thus four places
of decimals were retained throughout in the intermediate work of
this example, and three in the final result. If he carries out an
independent calculation, the student may differ slightly from
the logarithms given in this and the following work, if more or
fewer figures are retained.

Table II.

        1.               2.            3.               4.                5.
  Correlation-        Product                      Correlation-
  coefficient         Term of      Numerator.      coefficient       log √(1-r²).
  (Zero Order).      Numerator.                    (First Order).

  12   +0.52          +0.2009       +0.3191        12.3   +0.4013     [illegible]
  13   +0.41          +0.2548       +0.1552        13.2   +0.2084     [illegible]
  23   +0.49          +0.2132       +0.2768        23.1   +0.3553     1̄.97070

  12   +0.52          -0.0322       +0.5522        12.4   +0.5731     [illegible]
  14   -0.14          +0.1196       -0.2596        14.2   -0.3123     1̄.97772
  24   +0.23          -0.0728       +0.3028        24.1   +0.3580     1̄.97022

  13   +0.41          -0.0350       +0.4450        13.4   +0.4642     1̄.94731
  14   -0.14          +0.1025       -0.2425        14.3   -0.2746     1̄.98297
  34   +0.25          -0.0574       +0.3074        34.1   +0.3404     1̄.97326

  23   +0.49          +0.0575       +0.4325        23.4   +0.4590     1̄.94863
  24   +0.23          +0.1225       +0.1075        24.3   +0.1274     1̄.99645
  34   +0.25          +0.1127       +0.1373        34.2   +0.1618     1̄.99424

Table III.

        1.               2.            3.               4.                5.
  Correlation-        Product                      Correlation-
  coefficient         Term of      Numerator.      coefficient       log √(1-r²).
  (First Order).     Numerator.                    (Second Order).

  12.4   +0.5731      +0.2131       +0.3600        12.34   +0.457     1̄.94901
  13.4   +0.4642      +0.2631       +0.2011        13.24   +0.276     1̄.98277
  23.4   +0.4590      +0.2660       +0.1930        23.14   +0.266     1̄.98408

  12.3   +0.4013      -0.0350       +0.4363        12.34   +0.457
  14.3   -0.2746      +0.0511       -0.3257        14.23   -0.359     1̄.97013
  24.3   +0.1274      -0.1102       +0.2376        24.13   +0.270     1̄.98359

  13.2   +0.2084      -0.0505       +0.2589        13.24   +0.276
  14.2   -0.3123      +0.0337       -0.3460        14.23   -0.359
  34.2   +0.1618      -0.0651       +0.2269        34.12   +0.244     1̄.98664

  23.1   +0.3553      +0.1219       +0.2334        23.14   +0.266
  24.1   +0.3580      +0.1209       +0.2371        24.13   +0.270
  34.1   +0.3404      +0.1272       +0.2132        34.12   +0.244

Having obtained the correlations, the regressions can be
calculated from the third-order standard-deviations by equations of
the form (as in the last example),

\[ b_{12.34} = r_{12.34}\, \frac{\sigma_{1.234}}{\sigma_{2.134}} , \]

so the standard-deviations of lower orders need not be evaluated.
Using equations of the form

\[ \sigma_{1.234} = \sigma_1 (1 - r_{12}^{2})^{1/2} (1 - r_{13.2}^{2})^{1/2} (1 - r_{14.23}^{2})^{1/2} \]

we find

log σ_1.234 = 1.35740     σ_1.234 = 22.8
log σ_2.134 = 1.50597     σ_2.134 = 32.1
log σ_3.124 = 0.65773     σ_3.124 = 4.55
log σ_4.123 = 1.32914     σ_4.123 = 21.3

All the twelve regressions of the second order can be readily
calculated, given these standard deviations and the correlations,
but we may confine ourselves to the equation giving the changes
in pauperism ($X_1$) in terms of other variables as the most
important. It will be found to be

x_1 = 0.325 x_2 + 1.383 x_3 - 0.383 x_4,

or, transferring the origins and expressing the equation in terms of
percentage-ratios,

X_1 = -31.1 + 0.325 X_2 + 1.383 X_3 - 0.383 X_4,

or, again, in terms of percentage-changes (ratio - 100):

Percentage change in pauperism
    = + 1.4 per cent.
      + 0.325 times the change in out-relief ratio
      + 1.383 times the change in proportion of old
      - 0.383 times the change in population.
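The four-variable regression for pauperism can be retraced in the same manner. The sketch below is illustrative only: the helper names `partial_r`, `partial_sd` and `regression` are assumptions, and the coefficients are carried unrounded, so the constant term may differ slightly in its last digit from the value printed above, which seems to rest on the rounded coefficients.

```python
from math import sqrt

mean = {1: 104.7, 2: 90.6, 3: 107.7, 4: 111.3}
sd = {1: 29.2, 2: 41.7, 3: 5.5, 4: 23.8}
r0 = {frozenset(p): v for p, v in {
    (1, 2): 0.52, (1, 3): 0.41, (1, 4): -0.14,
    (2, 3): 0.49, (2, 4): 0.23, (3, 4): 0.25}.items()}

def partial_r(i, j, given=()):
    """r_{ij.given} by successive application of equation (12)."""
    if not given:
        return r0[frozenset((i, j))]
    *rest, k = given
    rest = tuple(rest)
    r_ij, r_ik, r_jk = (partial_r(i, j, rest), partial_r(i, k, rest),
                        partial_r(j, k, rest))
    return (r_ij - r_ik * r_jk) / sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))

def partial_sd(i, others):
    """sigma_{i.others}: sigma_i times one factor (1 - r^2)^(1/2) per suffix."""
    s = sd[i]
    for pos, j in enumerate(others):
        s *= sqrt(1 - partial_r(i, j, tuple(others[:pos])) ** 2)
    return s

def regression(i, j):
    """b_{ij.rest}, where rest is every variable other than i and j."""
    rest = tuple(k for k in sd if k not in (i, j))
    return (partial_r(i, j, rest)
            * partial_sd(i, (j,) + rest) / partial_sd(j, (i,) + rest))

b = {j: regression(1, j) for j in (2, 3, 4)}
constant = mean[1] - sum(b[j] * mean[j] for j in b)
# Constant when every variable is expressed as a change (ratio - 100).
change_constant = constant + 100 * sum(b.values()) - 100
print({j: round(v, 3) for j, v in b.items()})
print(round(constant, 1), round(change_constant, 1))
```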

These  results  render  the  interpretation  of  the  total  coefficients, 
which  might  be  equally  consistent  with  several  hypotheses,  more 
clear  and  definite.  The  questions  would  arise,  for  instance, 
whether  the  correlation  of  changes  in  pauperism  with  changes  in 
out-relief  might  not  be  due  to  correlation  of  the  latter  with  the 
other  factors  introduced,  and  whether  the  negative  correlation  with 
changes  in  population  might  not  be  due  solely  to  the  correlation 
of  the  latter  with  changes  in  the  proportion  of  old.  As  a  matter 
of  fact,  the  partial  correlations  of  changes  in  pauperism  with 
changes  in  out-relief  and  in  proportion  of  old  are  slightly  less  than 
the  total  correlations,  but  the  partial  correlation  with  changes  in 
population  is  numerically  greater,  the  figures  being 

r_12 = +0.52     r_12.34 = +0.46
r_13 = +0.41     r_13.24 = +0.28
r_14 = -0.14     r_14.23 = -0.36

So  far,  then,  as  we  have  taken  the  factors  of  the  case  into 
account,  there  appears  to  be  a  true  correlation  between  changes 
in  pauperism  and  changes  in  out-relief,  proportion  of  old,  and 
population — the  latter  serving,  of  course,  as  some  index  to 
changes  in  general  prosperity.  The  relative  influences  of  the 
three  factors  are  indicated  by  the  regression-equation  above. 
[For the full discussion of the case cf. Jour. Roy. Stat. Soc.,
vol. lxii., 1899.]

15. The correlation between pauperism and labourers' earnings
exhibited by the figures of Example i. was illustrated by a diagram
(fig. 40, p. 180), in which scales of "pauperism" and "earnings"
were taken along two axes at right angles, and every observed
pair of values was entered by marking the corresponding point
with a small circle: the diagram was completed by drawing in
the lines of regression. In precisely the same way the correlation
between three variables may be represented by a model showing the
distribution of points in space; for any set of observed values $X_1$,
$X_2$, $X_3$ may be regarded as determining a point in space, just as
any pair of values $X_1$ and $X_2$ may be regarded as determining a
point in a plane. Fig. 45 is drawn from such a model, constructed
from the data of Example i. Four pieces of wood are fixed together
like the bottom and three sides of a box. Supposing the open
side to face the observer, a scale of pauperism is drawn vertically
upwards along the left-hand angle at the back of the "box," the
scale starting from zero, as very small values of pauperism occur:
a scale of out-relief ratio is taken along the angle between the
back and bottom of the box, starting from zero at the left: finally,
the scale of earnings is drawn out towards the observer along the
angle between the left-hand side and the bottom, but as earnings
lower than 12s. do not occur, the scale may start from 12s. at the
corner. Suitable scales are: pauperism, 1 in. = 1 per cent.;
out-relief ratio, 1 in. = 1 unit; earnings, 1 in. = 1s.; and the inside
measures of the model may then be 17 in. × 10 in. × 8 in. high,
the dimensions of the model constructed. Given these three
scales, any set of observed values determine a point within the
"box." The earnings and out-relief ratio for some one union are
noted first, and the corresponding point marked on the baseboard;
a steel wire is then inserted vertically in the base at this point
and cut off at the height corresponding, on the scale chosen, to
the pauperism in the same union, being finally capped with a
small ball or knob to mark the "point" clearly. The model
shows very well the general tendency of the pauperism to be the
higher the lower the wages and the higher the out-relief, for the
highest points lie towards the back and right-hand side of the
model. If some representation of all three equations of regression
were to be inserted in the model, the result would be rather
confusing; so the most important equation, viz. the second, giving
the average rate of pauperism in terms of the other variables, may
be chosen. This equation represents a plane: the lines in which
it cuts the right- and left-hand sides of the "box" should be
marked, holes drilled at equal intervals on these lines on the
opposite sides of the box (the holes facing each other), and threads
stretched through these holes, thus outlining the plane as shown
in the figure. In the actual model the correlation-diagrams (like
fig. 40) corresponding to the three pairs of variables were drawn
on the back, sides and base: they represent, of course, the
elevations and plan of the points.

Fig. 45. Model illustrating the Correlation between three Variables: (1)
Pauperism (percentage of the population in receipt of Poor-law relief);
(2) Out-relief ratio (numbers given relief in their homes to one in the
workhouse); (3) Average Weekly Earnings of agricultural labourers
(data pp. 178 and 189). A, front view; B, view of model tilted till the
plane of regression for pauperism on the two remaining variables is seen
as a straight line.

The  student  possessing  some  skill  in  handicraft  would  find  it 
worth  while  to  make  such  a  model  for  some  case  of  interest  to 
himself,  and  to  study  on  it  thoroughly  the  nature  of  the  plane  of 
regression,  and  the  relations  of  the  partial  and  total  correlations. 

16. If we write

\[ \sigma_{1.23 \ldots n}^{2} = \sigma_{1}^{2}\,\bigl(1 - R_{1(23 \ldots n)}^{2}\bigr) \qquad (13) \]

it may be shown that $R_{1(23 \ldots n)}$ is the correlation between $x_1$
and the expression on the right-hand side of the regression-equation,
say $e_{1.23 \ldots n}$, where

\[ e_{1.23 \ldots n} = b_{12.34 \ldots n}\, x_2 + b_{13.24 \ldots n}\, x_3 + \ldots + b_{1n.23 \ldots (n-1)}\, x_n \qquad (14) \]

For we have

\[ \Sigma(x_1 \cdot e_{1.23 \ldots n}) = \Sigma\, x_1 (x_1 - x_{1.23 \ldots n}) = N(\sigma_1^{2} - \sigma_{1.23 \ldots n}^{2}) \]

and also

\[ \Sigma(e_{1.23 \ldots n}^{2}) = \Sigma (x_1 - x_{1.23 \ldots n})^{2} = N(\sigma_1^{2} - \sigma_{1.23 \ldots n}^{2}) \]

whence the correlation between $x_1$ and $e_{1.23 \ldots n}$ is

\[ \frac{(\sigma_1^{2} - \sigma_{1.23 \ldots n}^{2})^{1/2}}{\sigma_1} \]

i.e. the value of $R_{1(23 \ldots n)}$ given by (13). The value of R is
accordingly a useful datum as indicating how closely $x_1$ can
be expressed in terms of a linear function of $x_2 \ldots x_n$, and
the values of the regressions may be regarded as determined
by the condition that R shall be a maximum. Its value is
essentially positive as the product-sum $\Sigma(x_1 \cdot e_{1.23 \ldots n})$ is
positive. R may be termed a coefficient of (n-1)-fold (or double,
triple, etc.) correlation; for n variables there are n such
correlations, but in the limiting case of two variables the two are
identical. The value may be readily calculated, either from
$\sigma_{1.23 \ldots n}$ and $\sigma_1$ or directly from the equation

\[ 1 - R_{1(23 \ldots n)}^{2} = (1 - r_{12}^{2})(1 - r_{13.2}^{2})(1 - r_{14.23}^{2}) \ldots (1 - r_{1n.23 \ldots (n-1)}^{2}) \qquad (15) \]

It is obvious from this equation that since every bracket on
the right is not greater than unity,

\[ 1 - R_{1(23 \ldots n)}^{2} \leq 1 - r_{12}^{2} . \]

Hence $R_{1(23 \ldots n)}$ cannot be numerically less than $r_{12}$. For the
same reason, rewriting (15) in every possible form, $R_{1(23 \ldots n)}$
cannot be numerically less than $r_{12}$, $r_{13}$, ...., $r_{1n}$, i.e. any one
of the possible constituent coefficients of order zero. Further,
for similar reasons, $R_{1(23 \ldots n)}$ cannot be numerically less than
any possible constituent coefficient of any higher order. That
is to say, $R_{1(23 \ldots n)}$ is not numerically less than the greatest
of all the possible constituent coefficients, and is usually, though
not always, markedly greater. Thus in Example i., $R_{2(13)}$
(the coefficient of double correlation between pauperism on
the one hand, out-relief and labourers' earnings on the other)
is 0.839, and the numerically greatest of the possible constituent
coefficients is $r_{12.3} = -0.73$. Again, in Example ii., $R_{1(234)}$ is
0.626, and the numerically greatest of the possible constituent
coefficients is $r_{12.4} = +0.573$.

The student should notice that R is necessarily positive.
Further, even if all the variables $X_1$, $X_2$, .... were strictly
uncorrelated in the original universe as a whole, we should expect
$r_{12}$, $r_{13.2}$, $r_{14.23}$, etc., to exhibit values (whether positive or
negative) differing from zero in a limited sample. Hence, R will not
tend, on an average of such samples, to be zero, but will
fluctuate round some mean value. This mean value will
be the greater the smaller the number of observations in the
sample, and also the greater the number of variables. When
only a small number of observations are available it is,
accordingly, little use to deal with a large number of variables.
As a limiting case, it is evident that if we deal with n variables
and possess only n observations, all the partial correlations
of the highest possible order will be unity.
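For three variables, equation (15) reduces to a product of two brackets, and the value quoted for Example i. can be checked with a short Python sketch. The function name `multiple_r` and its argument names are assumptions made for the illustration.

```python
from math import sqrt

def multiple_r(r_xy, r_xz, r_yz):
    """R_{x(yz)} from 1 - R^2 = (1 - r_xy^2)(1 - r_xz.y^2), equation (15)."""
    r_xz_y = (r_xz - r_xy * r_yz) / sqrt((1 - r_xy ** 2) * (1 - r_yz ** 2))
    return sqrt(1 - (1 - r_xy ** 2) * (1 - r_xz_y ** 2))

# Example i.: pauperism (2) on earnings (1) and out-relief ratio (3).
R_2_13 = multiple_r(r_xy=-0.66, r_xz=0.60, r_yz=-0.13)
# Rewriting (15) in its other form must give the same value.
R_2_31 = multiple_r(r_xy=0.60, r_xz=-0.66, r_yz=-0.13)
print(round(R_2_13, 3), round(R_2_31, 3))
```

The result is never numerically less than either zero-order coefficient entering it, which is the inequality drawn from (15) above.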

17.  It  is  obvious  that  as  equations  (11)  and  (12)  enable  us  to 
express  regressions  and  correlations  of  higher  orders  in  terms  of 
those  of  lower  orders,  we  must  similarly  be  able  to  express  the 
coefficients  of  lower  in  terms  of  those  of  higher  orders.  Such 
expressions  are  sometimes  useful  for  theoretical  work.  Using  the 
same  method  of  expansion  as  in  previous  cases,  we  have 

\[ 0 = \Sigma(x_{1.23 \ldots n} \cdot x_{2.34 \ldots (n-1)}) \]
\[ \;\; = \Sigma(x_1 \cdot x_{2.34 \ldots (n-1)}) - b_{12.34 \ldots n}\, \Sigma(x_2 \cdot x_{2.34 \ldots (n-1)}) - b_{1n.23 \ldots (n-1)}\, \Sigma(x_n \cdot x_{2.34 \ldots (n-1)}) . \]

That is,

\[ b_{12.34 \ldots (n-1)} = b_{12.34 \ldots n} + b_{1n.23 \ldots (n-1)} \cdot b_{n2.34 \ldots (n-1)} . \]

In this equation the coefficient on the left and the last on the
right are of order n - 3, the other two of order n - 2. We therefore
wish to eliminate the last coefficient on the right. Interchanging
the suffixes 1 for n and n for 1, we have

\[ b_{n2.34 \ldots (n-1)} = b_{n2.13 \ldots (n-1)} + b_{n1.23 \ldots (n-1)} \cdot b_{12.34 \ldots (n-1)} . \]

Substituting this value for $b_{n2.34 \ldots (n-1)}$ in the first equation we
have

\[ b_{12.34 \ldots (n-1)} = \frac{b_{12.34 \ldots n} + b_{1n.23 \ldots (n-1)} \cdot b_{n2.13 \ldots (n-1)}}{1 - b_{1n.23 \ldots (n-1)} \cdot b_{n1.23 \ldots (n-1)}} \qquad (16) \]

This is the required equation for the regressions; it is the equation

\[ b_{12} = \frac{b_{12.n} + b_{1n.2} \cdot b_{n2.1}}{1 - b_{1n.2} \cdot b_{n1.2}} \]

with secondary suffixes 34 .... (n - 1) added throughout. The
corresponding equation for the correlations is obtained at once
by writing down equation (16) for $b_{21.34 \ldots (n-1)}$ and taking the
square root of the product (cf. § 13); this gives

\[ r_{12.34 \ldots (n-1)} = \frac{r_{12.34 \ldots n} + r_{1n.23 \ldots (n-1)} \cdot r_{2n.13 \ldots (n-1)}}{(1 - r_{1n.23 \ldots (n-1)}^{2})^{1/2}\,(1 - r_{2n.13 \ldots (n-1)}^{2})^{1/2}} \qquad (17) \]

which is similarly the equation

\[ r_{12} = \frac{r_{12.n} + r_{1n.2} \cdot r_{2n.1}}{(1 - r_{1n.2}^{2})^{1/2}\,(1 - r_{2n.1}^{2})^{1/2}} \]

with the secondary suffixes 34 .... (n - 1) added throughout.
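Equation (17) in its simplest form can be verified numerically by a round trip: compute the three first-order coefficients from those of order zero by (12), then recover a zero-order coefficient from them. The following Python sketch does this for the data of Example i.; the function names are assumptions made for the illustration.

```python
from math import sqrt

def partial(r_ab, r_ac, r_bc):
    """r_{ab.c} from the zero-order coefficients, equation (12)."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

def total_from_partials(r_ab_c, r_ac_b, r_bc_a):
    """r_{ab} from the three first-order coefficients, equation (17)."""
    return (r_ab_c + r_ac_b * r_bc_a) / sqrt(
        (1 - r_ac_b ** 2) * (1 - r_bc_a ** 2))

r12, r13, r23 = -0.66, -0.13, 0.60          # Example i.
r12_3 = partial(r12, r13, r23)
r13_2 = partial(r13, r12, r23)
r23_1 = partial(r23, r12, r13)
recovered = total_from_partials(r12_3, r13_2, r23_1)
print(round(recovered, 4))
```

Note the change of sign in the numerator as compared with (12): the product term is added, not subtracted.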

18.  Equations  (12)  and  (17)  imply  that  certain  limiting 
inequalities  must  hold  between  the  correlation-coefficients  in 
the  expression  on  the  right  in  each  case  in  order  that  real 
values  (values  between  ±1)  may  be  obtained  for  the  correlation- 
coefficient  on  the  left.  These  inequalities  correspond  precisely 
with  those  "conditions  of  consistence"  between  class-frequencies 
with  which  we  dealt  in  Chapter  XL,  but  we  propose  to  treat  them 
only briefly here. Writing (12) in its simplest form for $r_{12.3}$,
we must have $r_{12.3}^{2} < 1$ or

\[ \frac{(r_{12} - r_{13}\, r_{23})^{2}}{(1 - r_{13}^{2})(1 - r_{23}^{2})} < 1 \]

that is,

\[ r_{12}^{2} + r_{13}^{2} + r_{23}^{2} - 2\, r_{12}\, r_{13}\, r_{23} < 1 \qquad (18) \]

if the three r's are consistent with each other. If we take $r_{12}$,
$r_{13}$ as known, this gives as limits for $r_{23}$

\[ r_{12}\, r_{13} \pm \sqrt{1 - r_{12}^{2} - r_{13}^{2} + r_{12}^{2}\, r_{13}^{2}} . \]

Similarly writing (17) in its simplest form for $r_{12}$ in terms of
$r_{12.3}$, $r_{13.2}$, and $r_{23.1}$, we must have

\[ r_{12.3}^{2} + r_{13.2}^{2} + r_{23.1}^{2} + 2\, r_{12.3}\, r_{13.2}\, r_{23.1} < 1 \qquad (19) \]

and therefore, if $r_{12.3}$ and $r_{13.2}$ are given, $r_{23.1}$ must lie
between the limits

\[ -\, r_{12.3}\, r_{13.2} \pm \sqrt{1 - r_{12.3}^{2} - r_{13.2}^{2} + r_{12.3}^{2}\, r_{13.2}^{2}} . \]

The following table gives the limits of the third coefficient in
a few special cases, for the three coefficients of zero order and
of the first order respectively:


          Value of                         Limits of
  r12 or r12.3     r13 or r13.2         r23          r23.1

        0                0               ±1            ±1
       ±1               ±1               +1            -1
       ±1               ∓1               -1            +1
     ±√0.5            ±√0.5            0, +1         0, -1
     ±√0.5            ∓√0.5            0, -1         0, +1
The student should notice that the set of three coefficients of
order zero and value unity are only consistent if either one only,
or all three, are positive, i.e. +1, +1, +1, or -1, -1, +1; but
not -1, -1, -1. On the other hand, the set of three coefficients
of the first order and value unity are only consistent if one only,
or all three, are negative: the only consistent sets are +1, +1,
-1 and -1, -1, -1. The values of the two given r's need to
be very high if even the sign of the third can be inferred; if the
two are equal, they must be at least equal to √0.5 or 0.707 ....
Finally, it may be noted that no two values for the known
coefficients ever permit an inference of the value zero for the
third; the fact that 1 and 2, 1 and 3 are uncorrelated, pair and
pair, permits no inference of any kind as to the correlation
between 2 and 3, which may lie anywhere between +1 and -1.
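The limits implied by (18) and (19) are easily tabulated by machine. The sketch below is illustrative (the function names are assumptions); it uses the identity $1 - r_{12}^2 - r_{13}^2 + r_{12}^2 r_{13}^2 = (1 - r_{12}^2)(1 - r_{13}^2)$ to evaluate the quantity under the root, and reproduces the special cases of the table.

```python
from math import sqrt

def limits_zero_order(r12, r13):
    """Limits of r23 given r12 and r13, from inequality (18)."""
    root = sqrt(max(0.0, (1 - r12 ** 2) * (1 - r13 ** 2)))
    return r12 * r13 - root, r12 * r13 + root

def limits_first_order(r12_3, r13_2):
    """Limits of r23.1 given r12.3 and r13.2, from inequality (19)."""
    root = sqrt(max(0.0, (1 - r12_3 ** 2) * (1 - r13_2 ** 2)))
    return -r12_3 * r13_2 - root, -r12_3 * r13_2 + root

h = sqrt(0.5)
for a, b in [(0, 0), (1, 1), (1, -1), (h, h), (h, -h)]:
    lo0, hi0 = limits_zero_order(a, b)
    lo1, hi1 = limits_first_order(a, b)
    print(f"{a:+.3f} {b:+.3f}  r23: {lo0:+.3f} to {hi0:+.3f}"
          f"  r23.1: {lo1:+.3f} to {hi1:+.3f}")
```

The `max(0.0, ...)` guard is only a precaution against a minutely negative argument arising from floating-point rounding when a coefficient is numerically equal to unity.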

19.  We  do  not  think  it  necessary  to  add  to  this  chapter  a 
detailed  discussion  of  the  nature  of  fallacies  on  which  the  theory 
of  multiple  correlation  throws  much  light.  The  general  nature  of 
such  fallacies  is  the  same  as  for  the  case  of  attributes,  and  was 
discussed  fully  in  Chap.  IV.  §§  1-8.  It  suffices  to  point  out  the 
principal  sources  of  fallacy  which  are  suggested  at  once  by  the 
form of the partial correlation

\[ r_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{\sqrt{(1 - r_{13}^{2})(1 - r_{23}^{2})}} \qquad (a) \]

and from the form of the corresponding expression for $r_{12}$ in terms
of the partial coefficients

\[ r_{12} = \frac{r_{12.3} + r_{13.2}\, r_{23.1}}{\sqrt{(1 - r_{13.2}^{2})(1 - r_{23.1}^{2})}} \qquad (b) \]

From the form of the numerator of (a) it is evident (1) that even
if $r_{12}$ be zero, $r_{12.3}$ will not be zero unless either $r_{13}$ or
$r_{23}$, or both, are zero. If $r_{13}$ and $r_{23}$ are of the same sign the
partial correlation will be negative; if of opposite sign, positive.
Thus the quantity of a crop might appear to be unaffected, say, by
the amount of rainfall during some period preceding harvest:
this might be due merely to a correlation between rain and low
temperature, the partial correlation between crop and rainfall
being positive and important. We may thus easily misinterpret
a coefficient of correlation which is zero. (2) $r_{12.3}$ may be, indeed
often is, of opposite sign to $r_{12}$, and this may lead to still more
serious errors of interpretation.
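A purely hypothetical numerical case may make point (1) concrete; the figures are assumed for illustration and are not taken from any data. Let 1 denote the crop, 2 the rainfall, and 3 the temperature, and suppose $r_{12} = 0$, $r_{13} = +0.5$, $r_{23} = -0.5$ (rain going with low temperature). Then by (a):

```latex
\[
r_{12.3} = \frac{0 - (+0.5)(-0.5)}{\sqrt{(1 - 0.25)(1 - 0.25)}}
         = \frac{0.25}{0.75} \approx +0.33 ,
\]
```

so that a total correlation of zero between crop and rainfall is here consistent with a distinctly positive partial correlation for a given temperature.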

From the form of the numerator of (b), on the other hand, we
see that, conversely, $r_{12}$ will not be zero even though $r_{12.3}$ is zero,
unless either $r_{13.2}$ or $r_{23.1}$ is zero. This corresponds to the theorem
of Chap. IV. § 6, and indicates a source of fallacies similar to
those there discussed.
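
The two points drawn from the numerator of (a) may be illustrated by a
little arithmetic. The following is only a sketch: the correlations
entered are hypothetical values chosen for illustration (variable 1
standing for the crop, 2 for rainfall, 3 for temperature), and the
function name is ours.

```python
from math import sqrt

def partial_r(r12, r13, r23):
    """Partial correlation r12.3 from the three correlations of order zero,
    following the form of expression (a)."""
    return (r12 - r13 * r23) / sqrt((1 - r13**2) * (1 - r23**2))

# Point (1): r12 is zero, yet the partial correlation is not.
# All numerical values below are hypothetical illustrations.
same_sign = partial_r(0.0, 0.5, 0.5)       # r13 and r23 of the same sign
opposite_sign = partial_r(0.0, 0.5, -0.5)  # r13 and r23 of opposite sign

# Point (2): r12 is positive, yet r12.3 comes out negative.
reversed_sign = partial_r(0.2, 0.6, 0.6)

print(round(same_sign, 3), round(opposite_sign, 3), round(reversed_sign, 3))
```

With these values the partial correlation is about -0.333 when $r_{13}$
and $r_{23}$ agree in sign and about +0.333 when they differ, although
$r_{12}$ is zero in both cases; in the last case $r_{12} = +0.2$ but
$r_{12.3} = -0.25$.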

20. We have seen (§ 9) that $r_{12.3}$ is the correlation between $x_{1.3}$
and $x_{2.3}$, and that we might determine the value of this partial
correlation by drawing up the actual correlation table for the two
residuals in question. Suppose, however, that instead of drawing
up a single table we drew up a series of tables for values of $x_{1.3}$
and $x_{2.3}$ associated with values of $x_3$ lying within successive
class-intervals of its range. In general the value of $r_{12.3}$ would
not be the same (or approximately the same) for all such tables,
but would exhibit some systematic change as the value of $x_3$
increased. Hence $r_{12.3}$ should be regarded, in general, as of the
nature of an average correlation: the cases in which it measures
the correlation between $x_{1.3}$ and $x_{2.3}$ for every value of $x_3$ (cf.
Chap. XVI.) are probably exceptional. The process for determining
partial associations (cf. Chap. IV.) is, it will be remembered,
thorough and complete, as we always obtain the actual tables
exhibiting the association between, say, A and B in the universe
of C's and the universe of γ's: that these two associations may
differ  materially,  is  illustrated  by  Example  i.  of  Chap.  IV. 
(pp.  45-6).  It  might  sometimes  serve  as  a  useful  check  on 
partial-correlation  work  to  reclassify  the  observations  by  the 
fundamental  methods  of  that  chapter.  For  the  general  case  an 
extension  of  the  method  of  the  "  correlation-ratio  "  (Chap.  X.,  §  20) 
might  be  useful,  though  exceedingly  laborious.  It  is  actually 
employed  in  the  paper  cited  in  ref.  7  and  the  theory  more  fully 
developed  in  ref.  8. 

EXERCISES.

1. (Ref. 10.) The following means, standard-deviations, and correlations are
found for

$X_1$ = seed-hay crops in cwts. per acre,
$X_2$ = spring rainfall in inches,
$X_3$ = accumulated temperature above 42° F. in spring,

in a certain district of England during 20 years.

$M_1$ = 28.02    $\sigma_1$ = 4.42    $r_{12}$ = +0.80
$M_2$ = 4.91     $\sigma_2$ = 1.10    $r_{13}$ = -0.40
$M_3$ = 594      $\sigma_3$ = 85      $r_{23}$ = -0.56

Find the partial correlations and the regression-equation for hay-crop on spring
rainfall and accumulated temperature.

2.  (The  following  figures  must  be  taken  as  an  illustration  only  :  the  data 
on  which  they  were  based  do  not  refer  to  uniform  times  or  areas.) 

$X_1$ = deaths of infants under 1 year per 1000 births in same year (infantile
mortality).
$X_2$ = proportion per thousand of married women occupied for gain.
$X_3$ = death-rate of persons over 5 years of age per 10,000.
$X_4$ = proportion per thousand of population living 2 or more to a room
(overcrowding).

Taking  the  figures  below  for  30  urban  areas  in  England  and  Wales,  find  the 
partial  correlations  and  the  regression-equation  for  infantile  mortality  on  the 
other  factors. 

$M_1$ = 164    $\sigma_1$ = 20.0     $r_{12}$ = +0.49    $r_{23}$ = +0.15
$M_2$ = 158    $\sigma_2$ = 74.9     $r_{13}$ = +0.78    $r_{24}$ = -0.37
$M_3$ = 143    $\sigma_3$ = 22.4     $r_{14}$ = +0.20    $r_{34}$ = +0.23
$M_4$ = 205    $\sigma_4$ = 130.0

3.  If  all  the  correlations  of  order  zero  are  equal,  say  =  r,  what  are  the  values 
of  the  partial  correlations  of  successive  orders  ? 

Under  the  same  condition,  what  is  the  limiting  value  of  r  if  all  the  equal 
correlations  are  negative  and  n  variables  have  been  observed  ? 

4. What is the correlation between $x_{1.2}$ and $x_{2.1}$?

5.  Write  down  from  inspection  the  values  of  the  partial  correlations  for  the 
three  variables 

$X_1$, $X_2$, and $X_3 = a.X_1 + b.X_2$.

Check the answer to Qu. 7, Chap. XI., by working out the partial
correlations.

6. If the relation

$a.x_1 + b.x_2 + c.x_3 = 0$

holds for all sets of values of $x_1$, $x_2$, and $x_3$, what must the partial
correlations be?

Check the answer to Qu. 9, Chap. XI., by working out the partial
correlations.


PART  III.-THEORY  OF  SAMPLING. 


CHAPTER  XIII. 
SIMPLE  SAMPLING  OF  ATTRIBUTES. 

1.  The  problem  of  the  present  Part — 2.  The  two  chief  divisions  of  the  theory 
of  sampling — 3.  Limitation  of  the  discussion  to  the  case  of  simple 
sampling — 4.  Definition  of  the  chance  of  success  or  failure  of  a  given 
event— 5.  Determination  of  the  mean  and  standard-deviation  of  the 
number  of  successes  in  n  events — 6.  The  same  for  the  proportion  of 
successes  in  n  events  :  the  standard-deviation  of  simple  sampling  as  a 
measure  of  unreliability,  or  its  reciprocal  as  a  measure  of  precision — 7. 
Verification  of  the  theoretical  results  by  experiment — 8.  More  detailed 
discussion  of  the  assumptions  on  which  the  formula  for  the  standard- 
deviation  of  simple  sampling  is  based — 9-10.  Biological  cases  to 
which  the  theory  is  directly  applicable — 11.  Standard-deviation  of 
simple  sampling  when  the  numbers  of  observations  in  the  samples 
vary — 12.  Approximate  value  of  the  standard-deviation  of  simple 
sampling,  and  relation  between  mean  and  standard-deviation,  when 
the  chance  of  success  or  failure  is  very  small — 13.  Use  of  the  standard- 
deviation  of  simple  sampling,  or  standard  error,  for  checking  and 
controlling  the  interpretation  of  statistical  results. 

1.  On  several  occasions  in  the  preceding  chapters  it  has  been 
pointed  out  that  small  differences  between  statistical  measures  like 
percentages,  averages,  measures  of  dispersion  and  so  forth  cannot 
in general be assumed to indicate the action of definite and assignable
causes. Small differences may easily arise from indefinite
and  highly  complex  causation  such  as  determines  the  fluctuating 
proportions  of  heads  and  tails  in  tossing  a  coin,  of  black  balls  in 
drawing  samples  from  a  bag  containing  a  mixture  of  black  and 
white  balls,  or  of  cards  bearing  measurements  within  some  given 
class-interval  in  drawing  cards,  say,  from  an  anthropometric  record. 
In  100  throws  of  a  coin,  for  example,  we  may  have  noted  56  heads 
and  only  44  tails,  but  we  cannot  conclude  that  the  coin  is  biassed  : 
on  repeating  our  throws  we  may  get  only  48  heads  and  52  tails. 
Similarly,  if  on  measuring  the  statures  of  1000  men  in  each  of 
two  nations  we  find  that  the  mean  stature  is  slightly  greater  for 
nation A than for nation B, we cannot necessarily conclude that
the  real  mean  stature  is  greater  in  the  case  of  nation  A  :  possibly 
if  the  observations  were  repeated  on  different  samples  of  1000 
men  the  ratio  might  be  reversed. 

2.  The  theory  of  such  fluctuations  may  be  termed  the  theory 
of sampling, and there are two chief sections of the theory corresponding
to the theory of attributes and the theory of variables
respectively.  In  tossing  a  coin  we  only  classify  the  results  of  the 
tosses  as  heads  or  tails ;  in  drawing  balls  from  a  mixture  of  black 
and  white  balls,  we  only  classify  the  balls  drawn  as  black  or  as 
white.  These  cases  correspond  to  the  theory  of  attributes,  and 
the  general  case  may  be  represented  as  the  drawing  of  a  sample 
from a universe containing both A's and α's, the number or
proportion of A's in successive samples being observed. If, on the
other  hand,  we  put  in  a  bag  a  number  of  cards  bearing  different 
values  of  some  variable  X  and  draw  sample  batches  of  cards,  we 
can  form  averages  and  measures  of  dispersion  for  the  successive 
batches,  and  these  averages  and  measures  of  dispersion  will  vary 
slightly  from  one  batch  to  another.  If  associated  measures  of 
two  variables  X  and  Y  are  recorded  on  each  card,  we  can  also  form 
correlation-coefficients  for  the  different  batches,  and  these  will  vary 
in  a  similar  manner.  These  cases  correspond  to  the  theory  of 
variables,  and  it  is  the  function  of  the  theory  of  sampling  for  such 
cases  to  inform  us  as  to  the  fluctuations  to  be  expected  in  the 
averages,  measures  of  dispersion,  correlation-coefficients,  etc.,  in 
successive  samples.  In  the  present  and  the  three  following 
chapters  the  theory  of  sampling  is  dealt  with  for  the  case  of 
attributes  alone.  The  theory  is  of  great  importance  and  interest, 
not  only  from  its  applications  to  the  checking  and  control  of 
statistical  results,  but  also  from  the  theoretical  forms  of  frequency- 
distribution  to  which  it  leads.  Finally,  in  Chapter  XVII.  one  or 
two  of  the  more  important  cases  of  the  theory  of  sampling  for 
variables  are  briefly  treated,  the  greater  part  of  the  theory,  owing 
to  its  difficulty,  lying  somewhat  outside  the  limits  of  this  work. 

3.  The  theory  of  sampling  attains  its  greatest  simplicity  if 
every  observation  contributed  to  the  sample  may  be  regarded  as 
independent  of  every  other.  This  condition  of  independence 
holds  good,  e.g.,  for  the  tossing  of  a  coin  or  the  throwing  of  a  die  : 
the result of any one throw or toss does not affect, and is unaffected
by, the results of the preceding and following tosses.
It  does  not  hold  good,  on  the  other  hand,  for  the  drawing  of  balls 
from  a  bag :  if  a  ball  be  drawn  from  a  bag  containing  3  black 
and  3  white  balls,  the  remainder  may  be  either  2  black  and  3 
white,  or  2  white  and  3  black,  according  as  the  first  ball  was 
black  or  white.    The  result  of  drawing  a  second  ball  is  therefore 
dependent on the result of drawing the first. The disturbance
can  only  be  eliminated  by  drawing  from  a  bag  containing  a 
number  of  balls  that  is  infinitely  large  compared  with  the 
total  number  drawn,  or  by  returning  each  ball  to  the  bag  before 
drawing  the  next.  In  this  chapter  our  attention  will  be  confined 
to  the  case  of  independent  sampling,  as  in  coin-tossing  or  dice- 
throwing — the  simplest  cases  of  an  artificial  kind  suitable  for 
theoretical  study  and  experimental  verification.  For  brevity,  we 
may  refer  to  such  cases  of  sampling  as  simple  sampling  :  the 
implied  conditions  are  discussed  more  fully  in  §  8  below. 
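
The dependence between successive drawings from a small bag, and its
disappearance when the ball is returned or the bag is very large, can
be put into figures. The sketch below uses the bag of 3 black and 3
white balls mentioned above; the bag of 3000 of each colour and the
function name are our own illustrative assumptions.

```python
from fractions import Fraction

def chance_second_black(black, white, first_black, replace):
    """Chance that the second ball drawn is black, given the colour of the first."""
    if not replace:
        # The first ball is not returned, so the bag has changed.
        if first_black:
            black -= 1
        else:
            white -= 1
    return Fraction(black, black + white)

# Bag of 3 black and 3 white balls: first ball black, then first ball white.
without_return = (chance_second_black(3, 3, True, False),
                  chance_second_black(3, 3, False, False))
with_return = (chance_second_black(3, 3, True, True),
               chance_second_black(3, 3, False, True))
# Hypothetical large bag with the same proportions, no ball returned.
large_bag = (chance_second_black(3000, 3000, True, False),
             chance_second_black(3000, 3000, False, False))

print(without_return, with_return, large_bag)
```

Without return the chance of a black ball at the second drawing is 2/5
or 3/5 according to the result of the first; with return it is 1/2 in
either case; and in the large bag the two chances, 2999/5999 and
3000/5999, are sensibly equal.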

4.  If  we  may  regard  an  ideal  coin  as  a  uniform,  homogeneous 
circular  disc,  there  is  nothing  which  can  make  it  tend  to  fall  more 
often on the one side than on the other; we may expect, therefore,
that in any long series of throws the coin will fall with
either  face  uppermost  an  approximately  equal  number  of  times, 
or  with,  say,  heads  uppermost  approximately  half  the  times. 
Similarly,  if  we  may  regard  the  ideal  die  as  a  perfect  homogeneous 
cube,  it  will  tend,  in  any  long  series  of  throws,  to  fall  with  each 
of  its  six  faces  uppermost  an  approximately  equal  number  of 
times,  or  with  any  given  face  uppermost  one-sixth  of  the  whole 
number  of  times.  These  results  are  sometimes  expressed  by 
saying  that  the  chance  of  throwing  heads  (or  tails)  with  a  coin  is 
1/2,  and  the  chance  of  throwing  six  (or  any  other  face)  with  a  die 
is  1/6.  To  avoid  speaking  of  such  particular  instances  as  coins 
or  dice,  we  shall  in  future,  using  terms  which  have  become 
conventional,  refer  to  an  event  the  chance  of  success  of  which  is  p 
and the chance of failure q. Obviously p + q = 1.

5. Suppose we take N samples with n events in each. What
will be the values towards which the mean and standard-deviation
of the number of successes in a sample will tend? The mean is
given at once, for there are N.n events, of which approximately
pNn will be successes, and the mean number of successes in a
sample will therefore tend towards pn. As regards the standard-deviation,
consider first the single event (n = 1). The single
event may give either no successes or one success, and will tend
to give the former qN, the latter pN, times in N trials. Take
this frequency-distribution and work out the standard-deviation
of the number of successes for the single event, as in the case of
an arithmetical example:

Frequency f.    Successes ξ.    fξ.     fξ².
qN              0               0       0
pN              1               pN      pN

N               -               pN      pN

We have therefore M = p, and

$$\sigma_1^{2} = p - p^{2} = pq.$$


But  the  number  of  successes  in  a  group  of  n  such  events  is  the 
sum  of  successes  for  the  single  events  of  which  it  is  composed, 
and,  all  the  events  being  independent,  we  have  therefore,  by  the 
usual  rule  for  the  standard-deviation  of  the  sum  of  independent 
variables (Chap. XI. § 2, equation (2)), $\sigma_n$ being the standard-deviation
of the number of successes in n events,

$$\sigma_n^{2} = npq \qquad (1)$$

This  is  an  equation  of  fundamental  importance  in  the  theory  of 
sampling.  The  student  should  particularly  bear  in  mind  that 
the  standard-deviation  of  the  number  of  successes,  due  to 
fluctuations  of  simple  sampling  alone,  in  a  group  of  n  events 
varies, not directly as n, but as the square root of n.
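
Equation (1) may be checked by exact arithmetic, building up the
distribution of the number of successes one event at a time and
working out its mean and variance directly. This is a sketch only: the
function names are ours, and p = 1/6 (a six with a true die) is taken
merely as an example.

```python
from fractions import Fraction

def successes_distribution(n, p):
    """Exact chances of 0, 1, ..., n successes in n independent events."""
    q = 1 - p
    dist = [Fraction(1)]              # with no events, 0 successes is certain
    for _ in range(n):
        nxt = [Fraction(0)] * (len(dist) + 1)
        for k, chance in enumerate(dist):
            nxt[k] += chance * q      # the added event fails
            nxt[k + 1] += chance * p  # the added event succeeds
        dist = nxt
    return dist

def mean_and_variance(dist):
    """Mean and variance of the number of successes, from the exact chances."""
    mean = sum(k * c for k, c in enumerate(dist))
    var = sum(k * k * c for k, c in enumerate(dist)) - mean**2
    return mean, var

p = Fraction(1, 6)
for n in (1, 4, 12):
    mean, var = mean_and_variance(successes_distribution(n, p))
    # The last two columns printed should agree: the variance and npq.
    print(n, mean, var, n * p * (1 - p))
```

For n = 12 the mean is 2 and the variance 5/3, so that the
standard-deviation is the square root of 12 × 1/6 × 5/6, as the rule
for the sum of independent variables requires.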

6.  In  lieu  of  recording  the  absolute  number  of  successes  in  each 
sample  of  n  events,  we  might  have  recorded  the  proportion  of 
such successes, i.e. 1/nth of the number in each sample. As this
would  amount  to  merely  dividing  all  the  figures  of  the  original 
record  by  n,  the  mean  proportion  of  successes — or  rather  the  value 
towards  which  the  mean  tends  to  approach — must  be  p,  and  the 
standard-deviation of the proportion of successes $s_n$ will be given by

$$s_n^{2} = \sigma_n^{2}/n^{2} = pq/n \qquad (2)$$

The  standard-deviation  of  the  proportion  of  successes  in  samples 
of  such  independent  events  varies  therefore  inversely  as  the  square 
root  of  the  number  on  which  the  proportion  is  calculated.  Now 
if  we  regard  the  observed  proportion  in  any  one  sample  as  a 
more  or  less  unreliable  determination  of  the  true  proportion  in 
a very large sample from the same material, the standard-deviation
of sampling may fairly be taken as a measure of the
unreliability of the determination: the greater the standard-deviation,
the greater the fluctuations of the observed proportion,
although the true proportion is the same throughout. The
reciprocal of the standard-deviation (1/s), on the other hand, may
be regarded as a measure of reliability, or, as it is sometimes
termed, precision, and consequently the reliability or precision of
an observed proportion varies as the square root of the number of
observations on which it is based. This is again a very important
rule with many practical applications, but the limitations of the
case to which it applies, and the exact conditions from which it
has been deduced, should be borne in mind. We return to this
point again below (§ 8 and Chap. XIV.).
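
The rule that precision varies as the square root of the number of
observations may be seen in a short computation. This is a sketch: the
function name is ours, and p = 0.5 with samples of 100, 400, and 1600
events are chosen only for illustration.

```python
from math import sqrt

def sd_of_proportion(p, n):
    """Standard-deviation of the proportion of successes in n events, equation (2)."""
    return sqrt(p * (1 - p) / n)

p = 0.5
for n in (100, 400, 1600):
    s = sd_of_proportion(p, n)
    # s is the measure of unreliability, 1/s the measure of precision.
    print(n, round(s, 4), round(1 / s, 1))
```

Each fourfold increase in n halves the standard-deviation and doubles
the precision. For 100 throws of a coin the standard-deviation of the
proportion of heads is 0.05, so that an observed proportion of 0.56
differs from 0.5 by little more than the standard-deviation.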

7.  Experiments  in  coin  tossing,  dice  throwing,  and  so  forth 
have been carried out by various persons in order to obtain experimental
verification of these results. The following will serve
as  illustrations,  but  the  student  is  strongly  recommended  to 
carry  out  a  few  series  of  such  experiments  personally,  in  order  to 
acquire  confidence  in  the  use  of  the  theory.  It  may  be  as  well 
to  remark  that  if  ordinary  commercial  dice  are  to  be  used  for  the 
trials,  care  should  be  taken  to  see  that  they  are  fairly  true  cubes, 
and  the  marks  not  cut  very  deeply.  Cheap  dice  are  generally 
very  much  out  of  truth,  and  if  the  marks  are  deeply  cut  the 
balance  of  the  die  may  be  sensibly  affected.  A  convenient  mode 
of  throwing  a  number  of  dice,  suggested,  we  believe,  by  the  late 
Professor  Weldon,  is  to  roll  them  down  an  inclined  gutter  of 
corrugated  paper,  so  that  they  roll  across  the  corrugations. 

(1)  (W.  F.  R.  Weldon,  cited  by  Professor  F.  Y.  Edgeworth, 
Encycl.  Brit.,  11th  edn.,  vol.  xxii.  p.  394.  Totals  of  the  columns 
in  the  table  there  given.) 

Twelve dice were thrown 4096 times; a throw of 4, 5, or 6 points
reckoned a success, therefore p = q = 0.5. Theoretical mean M = 6;
theoretical value of the standard-deviation
$\sigma = \sqrt{0.5 \times 0.5 \times 12} = 1.732$.

The following was the frequency-distribution observed:

Successes.    Frequency.        Successes.    Frequency.
0             -                 7             847
1             7                 8             536
2             60                9             257
3             198               10            71
4             430               11            11
5             731               12            -
6             948
                                Total         4096

Mean M = 6.139, standard-deviation σ = 1.712. The proportion of
successes is 6.139/12 = 0.512 instead of 0.5.

(2)  (W.  F.  R.  Weldon,  loc.  cit.,  p.  400.  Totals  of  columns  of 
the  table  given.) 

Twelve dice were thrown 4096 times; only a throw of 6 was
counted a success, so p = 1/6, q = 5/6. Theoretical mean M = 2,
standard-deviation $\sigma = \sqrt{1/6 \times 5/6 \times 12} = 1.291$.

The following was the observed frequency-distribution:

Successes.    Frequency.        Successes.    Frequency.
0             447               5             115
1             1145              6             24
2             1181              7             7
3             796               8             1
4             380
                                Total         4096

Mean M = 2.000, standard-deviation σ = 1.296. Actual proportion
of successes 2.00/12 = 0.1667, agreeing with the theoretical value
to the fourth place of decimals. Of course such very close
agreement is accidental, and not to be always expected.

(3) (G. U. Yule.) The following may be taken as an illustration
based on a smaller number of observations. Three dice were
thrown 648 times, and the numbers of 5's or 6's noted at
each throw. p = 1/3, q = 2/3. Theoretical mean 1. Standard-deviation,
0.816.

Frequency-distribution observed:

Successes.    Frequency.
0             179
1             298
2             141
3             30

Total         648

M = 1.034, σ = 0.823. Actual proportion of successes 0.345.

For  other  illustrations,  some  of  which  are  cited  in  the  questions 
at  the  end  of  this  chapter,  the  student  may  be  referred  to  the 
list  of  references  on  p.  273.  The  student  should  notice  that  in 
all  the  distributions  given  a  range  of  six  times  the  standard- 
deviation  includes  either  all,  or  the  great  bulk  of,  the  observations, 
as  in  most  frequency-distributions  of  the  same  general  form.  We 
shall  make  use  of  this  rule  below,  §  13. 
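
The means and standard-deviations quoted for the three experiments,
and the remark about a range of six times the standard-deviation, can
be recomputed from the frequencies. The sketch below transcribes the
three tables above, reading blank entries as zero frequencies; the
function names and the dictionary layout are ours, and the range is
taken as three times the standard-deviation on either side of the mean.

```python
from math import sqrt

# Observed frequencies of 0, 1, 2, ... successes, then n and p for the theory.
experiments = {
    "(1)": ([0, 7, 60, 198, 430, 731, 948, 847, 536, 257, 71, 11, 0], 12, 1 / 2),
    "(2)": ([447, 1145, 1181, 796, 380, 115, 24, 7, 1, 0, 0, 0, 0], 12, 1 / 6),
    "(3)": ([179, 298, 141, 30], 3, 1 / 3),
}

def moments(freq):
    """Total, mean, and standard-deviation of a frequency-distribution."""
    total = sum(freq)
    mean = sum(k * f for k, f in enumerate(freq)) / total
    var = sum(k * k * f for k, f in enumerate(freq)) / total - mean**2
    return total, mean, sqrt(var)

def share_within(freq, mean, sd, width=3.0):
    """Proportion of observations lying within mean +/- width * sd."""
    inside = sum(f for k, f in enumerate(freq) if abs(k - mean) <= width * sd)
    return inside / sum(freq)

results = {}
for name, (freq, n, p) in experiments.items():
    total, mean, sd = moments(freq)
    theory_sd = sqrt(n * p * (1 - p))          # equation (1)
    results[name] = (total, round(mean, 3), round(sd, 3),
                     round(theory_sd, 3), share_within(freq, mean, sd))
    print(name, results[name])
```

The observed values 6.139 and 1.712, 2.000 and 1.296, 1.034 and 0.823
are reproduced, against theoretical standard-deviations of 1.732,
1.291, and 0.816; and in each experiment more than 99 per cent. of the
observations lie within the range of six times the standard-deviation.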

8.  In  deducing  the  formulae  (1)  and  (2)  for  the  standard- 
deviations  of  simple  sampling  in  the  cases  with  which  we  have 
been  dealing,  only  one  condition  has  been  explicitly  laid  down  as 
necessary,  viz.  the  independence  of  the  several  drawings,  tossings, 
or  other  events  composing  the  sample.  But  in  point  of  fact  this 
is  not  the  only  nor  the  most  fundamental  condition  which  has 
been  explicitly  or  implicitly  assumed,  and  it  is  necessary  to  realise 
all  the  conditions  in  order  to  grasp  the  limitations  under  which 
alone  the  formulae  arrived  at  will  hold.  Supposing,  for  example, 
that  we  observe  among  groups  of  1000  persons,  at  different  times 
or  in  different  localities,  various  percentages  of  individuals 
possessing  certain  characteristics— dark  hair,  or  blindness,  or 
insanity,  and  so  forth.  Under  what  conditions  should  we 
expect  the  observed  percentages  to  obey  the  law  of  sampling 
that  we  have  found,  and  show  a  standard-deviation  given  by 
equation  (2)? 

(a)  In  the  first  place  we  have  tacitly  assumed  throughout  the 
preceding  work  that  our  dice  or  our  coins  were  the  same  set  or 
identically similar throughout the experiment, so that the chance
of  throwing  "  heads  "  with  the  coins  or,  say,  "  six  "  with  the  dice 
was  the  same  throughout :  we  did  not  commence  an  experiment 
with  dice  loaded  in  one  way  and  later  on  take  a  fresh  set  of  dice 
loaded  in  another  way.  Consequently  if  formula  (2)  is  to  hold 
good  in  our  practical  case  of  sampling  there  must  not  be  a 
difference  in  any  essential  respect — i.e.  in  any  character  that  can 
affect  the  proportion  observed — between  the  localities  from  which 
the  observations  are  drawn,  nor,  if  the  observations  have  been 
made  at  different  epochs,  must  any  essential  change  have  taken 
place  during  the  period  over  which  the  observations  are  spread. 
Where  the  causation  of  the  character  observed  is  more  or  less 
unknown,  it  may,  of  course,  be  difficult  or  impossible  to  say  what 
differences  or  changes  are  to  be  regarded  as  essential,  but,  where 
we  have  more  knowledge,  the  condition  laid  down  enables  us  to 
exclude  certain  cases  at  once  from  the  possible  applications  of 
formula  (1)  or  (2).  Thus  it  is  obvious  that  the  theory  of  simple 
sampling  cannot  apply  to  the  variations  of  the  death-rate  in 
localities  with  populations  of  different  age  and  sex  compositions, 
nor  to  death-rates  in  a  mixture  of  healthy  and  unhealthy  districts, 
nor to death-rates in successive years during a period of continuously
improving sanitation. In all such cases variations
due  to  definite  causes  are  superposed  on  the  fluctuations  of 
sampling. 

(b) In the second place, we have also tacitly assumed not
only  that  we  were  using  the  same  set  of  coins  or  dice  throughout, 
so  that  the  chances  p  and  q  were  the  same  at  every  trial,  but 
also  that  all  the  coins  and  dice  in  the  set  used  were  identically 
similar,  so  that  the  chances  p  and  q  were  the  same  for  every  coin 
or  die.  Consequently,  if  our  formulae  are  to  apply  in  the  practical 
case  of  sampling,  the  conditions  that  regulate  the  appearance  of 
the  character  observed  must  not  only  be  the  same  for  every 
sample,  but  also  for  every  individual  in  every  sample.  This  is 
again  a  very  marked  limitation.  To  revert  to  the  case  of  death- 
rates,  formulae  (1)  and  (2)  would  not  apply  to  the  numbers  of 
persons  dying  in  a  series  of  samples  of  1000  persons,  even  if  these 
samples  were  all  of  the  same  age  and  sex  composition,  and  living 
under  the  same  sanitary  conditions,  unless,  further,  each  sample 
only  contained  persons  of  one  sex  and  one  age.  For  if  each 
sample  included  persons  of  both  sexes  and  different  ages,  the 
condition  would  be  broken,  the  chance  of  death  during  a  given 
period  not  being  the  same  for  the  two  sexes,  nor  for  the  young 
and  the  old.  The  groups  would  not  be  homogeneous  in  the  sense 
required  by  the  conditions  from  which  our  formulae  have  been 
deduced. Similarly, if we were observing hair-colours, our formulae
would not apply if the samples were compounded by always
taking one person from district A, another from district B, and
so  on,  these  districts  not  being  similar  as  regards  the  distribution 
of  hair-colour. 
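
How far formula (1) may be in error when condition (a) or (b) is broken
can be judged from an artificial case. The sketch below is our own
illustration, not part of the argument above: the chances 1/100 and
3/100 and the sample size 1000 are hypothetical, and the exact
variances are obtained from the ordinary rules for independent events
(variances add within a sample, and the variation of the sample means
is added when the chance differs from sample to sample).

```python
from fractions import Fraction

def simple_sampling_variance(n, p):
    """Equation (1): variance npq of the number of successes in n events."""
    return n * p * (1 - p)

def variance_mixed_within(chances):
    """Condition (b) broken: every individual in the sample has its own chance.
    The events remain independent, so the separate variances p*q add."""
    return sum(p * (1 - p) for p in chances)

def variance_mixed_between(n, sample_chances):
    """Condition (a) broken: each sample is homogeneous, but the chance differs
    from sample to sample, each listed value occurring equally often."""
    k = len(sample_chances)
    within = sum(simple_sampling_variance(n, p) for p in sample_chances) / k
    mean_p = sum(sample_chances) / k
    between = sum((n * p - n * mean_p) ** 2 for p in sample_chances) / k
    return within + between

n = 1000
low, high = Fraction(1, 100), Fraction(3, 100)   # hypothetical chances
mean_p = (low + high) / 2

npq = simple_sampling_variance(n, mean_p)
within = variance_mixed_within([low] * 500 + [high] * 500)
between = variance_mixed_between(n, [low, high])
print(float(npq), float(within), float(between))
```

In this artificial case formula (1) gives a variance of 19.6; mixing
the two classes of individual within every sample gives 19.5; and
taking whole samples alternately from the two classes gives 119.5. The
formula is therefore not to be trusted when either condition fails,
and the error may lie in either direction.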

The  above  conditions  were  only  tacitly  assumed  in  our  previous 
work,  and  consequently  it  has  been  necessary  to  emphasise  them 
specially.  The  third  condition  was  explicitly  stated :  (c)  The 
individual  "events,"  or  appearances  of  the  character  observed, 
must  be  completely  independent  of  one  another,  like  the  throws 
of  a  die,  or  sensibly  so,  like  the  drawings  of  balls  from  a  bag 
containing  a  number  of  balls  that  is  very  large  compared  with 
the  number  drawn.  Reverting  to  the  illustration  of  a  death-rate, 
our  formulae  would  not  apply  even  if  the  sample  populations 
were  composed  of  persons  of  one  age  and  one  sex,  if  we  were 
dealing,  for  example,  with  deaths  from  an  infectious  or  contagious 
disease.  For  if  one  person  in  a  certain  sample  has  contracted 
the  disease  in  question,  he  has  increased  the  possibility  of  others 
doing  so,  and  hence  of  dying  from  the  disease.  The  same  thing 
holds  good  for  certain  classes  of  deaths  from  accident,  e.g.  railway 
accidents  due  to  derailment,  and  explosions  in  mines :  if  such  an 
accident  is  fatal  to  one  person  it  is  probably  fatal  to  others  also, 
and  consequently  the  annual  returns  show  large  and  more  or 
less  erratic  variations. 
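
The effect of a breach of condition (c) may be illustrated in the same
way by an extreme and wholly artificial case, which is our own
assumption: suppose 1000 persons to travel in parties of 10, each
party meeting with a fatal accident with chance 1/100, and every
member of a party so struck being killed.

```python
from fractions import Fraction

def variance_independent(n, p):
    """Equation (1): variance of the number of deaths if all are independent."""
    return n * p * (1 - p)

def variance_linked(n, group, p):
    """Variance of the number of deaths when persons die in whole parties.
    The deaths number `group` times the number of parties struck, so the
    variance of the parties struck is multiplied by group squared."""
    parties = n // group
    return group**2 * parties * p * (1 - p)

n, p = 1000, Fraction(1, 100)    # hypothetical numbers
independent = variance_independent(n, p)
linked = variance_linked(n, 10, p)
print(float(independent), float(linked))
```

The mean number of deaths is 10 in both cases, but the variance is 9.9
for independent deaths and 99 for deaths in parties of ten: such
returns would show far larger fluctuations than simple sampling allows.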

When  we  speak  of  simple  sampling  in  the  following  pages,  the 
term  is  intended  to  imply  the  fulfilment  of  all  the  conditions  (a), 
(b),  and  (c),  all  the  samples  and  all  the  individual  contributions  to 
each  sample  being  taken  under  precisely  the  same  conditions, 
and  the  individual  "  events  "  or  appearances  of  the  character  being 
quite  independent.  It  may  be  as  well  expressly  to  note  that  we 
need  not  make  any  assumption  as  to  the  conditions  that  determine 
p unless we have to estimate $\sqrt{npq}$ a priori. If we draw a
sample and observe in it the actual proportion of, say, A's:
draw another sample under precisely the same conditions, and
observe the proportion of A's in the two samples together: add
to these a third sample, and so on, we will find that p approaches,
not continuously, but with some fluctuations, closer and closer
to some limiting value. It is this limiting value which is to be
used in our formulae: the value of p that would be observed in
a very large sample. The standard-deviation of the number of
sixes thrown with n dice, on this understanding, may be $\sqrt{npq}$,
even if the dice be out of truth or loaded so that p is no longer
1/6. Similarly, the standard-deviation of the number of black
balls in samples of n drawn from an infinitely large mixture of
black and white balls in equal proportions may be $\sqrt{npq}$ even
if p is, say, 1/3, and not 1/2 owing to the black balls, for some
reason, tending to slip through our fingers. (Cf. Chap. XIV.)

9.  It  is  evident  that  these  conditions  very  much  limit  the 
field  of  practical  cases  of  an  economic  or  sociological  character 
to  which  formulae  (1)  and  (2)  can  apply  without  considerable 
modification.  The  formulae  appear,  however,  to  hold  to  a  high 
degree  of  approximation  in  certain  biological  cases,  notably  in 
the  proportions  of  offspring  of  different  types  obtained  on  crossing 
hybrids,  and,  with  some  limitations,  to  the  proportions  of  the 
two  sexes  at  birth.  It  is  possible,  accordingly,  that  in  these  cases 
all  the  necessary  conditions  are  fulfilled,  but  this  is  not  a  necessary 
inference  from  the  mere  applicability  of  the  formulae  {cf.  Chap. 
XIV.  §  15).  In  the  case  of  the  sex-ratio  at  birth,  it  seems 
doubtful  whether  the  rule  applies  to  the  frequency  of  the  sexes  in 
individual  families  of  given  numbers  (ref.  9),  but  it  does  apply 
fairly closely to the sex-ratios of births in different localities,
and still more closely to the ratios in one locality during
successive periods. That is to say, if we note the number of
males in a series of groups of n births each, the standard-deviation
of that number is approximately $\sqrt{npq}$, where p is the chance
of a male birth; or, otherwise, $\sqrt{pq/n}$ is the standard-deviation
of  the  proportion  of  male  births.  We  are  not  able  to  assign  an 
a  priori  value  to  the  chance  p  as  in  the  case  of  dice-throwing, 
but  it  is  quite  sufficiently  accurate  for  practical  purposes  to  use 
the  proportion  of  male  births  actually  observed  if  that  proportion 
be  based  on  a  moderately  large  number  of  observations. 

10.  In  Table  VI.  of  Chap.  IX.  (p.  163)  was  given  a  correlation- 
table  between  the  total  numbers  of  births  in  the  registration  districts 
of England and Wales during the decade 1881-90 and the proportion
of male births. The table below gives some similar figures,
based on the same data, for a few isolated groups of districts
containing not less than 30 to 40 districts each. In both tables the
drop in dispersion as we pass from the small to the large districts
is extremely striking. The actual standard-deviations, and the
standard-deviations of simple sampling corresponding to the
mid-numbers of births, are given at the foot of the table, and it will
be  seen  that  the  two  agree,  on  the  whole,  with  surprising  closeness, 
considering  the  small  numbers  of  observations.  The  actual 
standard-deviation  is,  however,  the  larger  of  the  two  in  every  case 
but  one.  The  corresponding  standard-deviations  for  Table  VI.  of 
Chap.  IX.  are  given  in  Qu.  7  at  the  end  of  this  chapter,  and  show 
the  same  general  agreement  with  the  standard-deviations  of  simple 
sampling;  the  actual  standard-deviations  are,  however,  again,  as 
a  rule,  slightly  in  excess  of  the  theoretical  values. 

TABLE showing Frequencies of Registration Districts in England and Wales with
Different Ratios of Male to Total Births during the Decade 1881-90, for
Groups of Districts with the Numbers of Births in the Decade lying between
Certain Limits. [Data based on Decennial Supplement to Fifty-fifth Annual
Report of the Registrar-General for England and Wales.]


Number  of  Births 

in  Decade. 

Male  Births 

per  Thousand 
Total  Births, 

1500 

to 
2500. 

3500 

to 
4000. 

4500 

to 
5000. 

10,000 

to 
15,000. 

15,000 

to 
20,000. 

30,000 

to 
50,000. 

50,000 

to 
90,000. 

466-67 

1 

482-  3 

1 

492-  3 
494-  5 
496-  7 
498-  9 
500-  1 
502-  3 
504-  5 
506-  7 
508-  9 
510-  1 
512-  3 
514-  5 
516-  7 
518-  9 
520-  1 
522-  3 
524-  5 
526-  7 
528-  9 
530-  1 
532-  3 

0o4—  £) 

536-  7 

1 
1 

2 

2 
3 
3 
5 

4 
1 
2 

4 
1 
2 
1 
1 

1 

3 
1 
4 
3 
1 
5 
3 

Q 

o 
5 
2 
3 

1 

9 
1 

1 

1 
1 

2 
3 
3 
3 
3 
y 
2 
3 
3 
3 
1 
3 

— 

1 

3 

10 
6 
9 

10 

8 
10 
5 
4 

1 

1 

4 

6 
4 

0 

9 
2 
2 

1 

— 

1 

4 
6 
16 

Q 
O 

4 
3 
1 

— 

6 

10 
12 
5 
2 

— 
— 

Number of births          1500    3500    4500   10,000  15,000  30,000  50,000
in decade                  to      to      to      to      to      to      to
                          2500    4000    5000   15,000  20,000  50,000  90,000

Total                       36      38      40      73      33      43      35
Mean                     508.2   509.5   510.2   510.6   510.3   509.0   507.8
Standard deviation s      12.8    8.53    7.12    4.98    3.87    3.22    2.20
Theo. st. deviation
  corresponding to
  mean births s₀          11.2    8.16    7.25    4.47    3.78    2.50    1.89
√(s² − s₀²) *              6.2     2.5      -      2.2     0.8     2.0     1.1

* The meaning of this expression is explained in § 10 of Chap. XIV.
The student should note that in both cases the standard-deviations
given are standard-deviations of the proportion of male
births per 1000 of all births, that is, 1000 times the values given
by equation (2). These values are given by simply substituting
the proportions per 1000 for $p$ and $q$ in the formula. Thus for
the first column of Table I. the proportion of males is 508 per
1000 births, the mid-number of births 2000, and therefore

$$ s_0 = \left(\frac{508 \times 492}{2000}\right)^{1/2} = 11.2 $$
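
The substitution just described can be sketched in Python for every column of the table. This is only an illustrative check: the function name is hypothetical, and the mid-numbers of births are assumed to be the midpoints of the class limits (the text states this value, 2000, only for the first column).

```python
import math

def sampling_sd_per_1000(males_per_1000, births):
    """Standard-deviation of simple sampling for a ratio quoted per 1000 births."""
    p = males_per_1000          # proportion of males, per 1000
    q = 1000.0 - p              # proportion of females, per 1000
    return math.sqrt(p * q / births)

# Means from the foot of the table; mid-numbers are assumed class midpoints.
means = [508.2, 509.5, 510.2, 510.6, 510.3, 509.0, 507.8]
mid_births = [2000, 3750, 4750, 12500, 17500, 40000, 70000]

theoretical = [sampling_sd_per_1000(m, n) for m, n in zip(means, mid_births)]
for n, s0 in zip(mid_births, theoretical):
    print(f"mid-number {n:>6}: s0 = {s0:.2f}")
```

Under these assumptions the computed values agree with the theoretical row at the foot of the table to within rounding.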


11.  In  the  above  illustration  the  difficulty  due  to  the  wide 
variation  in  the  number  of  births  n  in  different  districts  has  been 
surmounted  by  grouping  these  districts  in  limited  class  intervals, 
and  assuming  that  it  would  be  sufficiently  accurate  for  practical 
purposes  to  treat  all  the  districts  in  one  class  as  if  the  sex-ratios 
had  been  based  on  the  mid-numbers  of  births.  Given  a  sufficiently 
large  number  of  observations,  such  a  process  does  well  enough, 
though  it  is  not  very  good.  But  if  the  number  of  observations 
does  not  exceed,  perhaps,  50  or  60  altogether,  grouping  is 
obviously  out  of  the  question,  and  some  other  procedure  must  be 
adopted. 

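Suppose, then, that a series of samples have been taken from
the same material, $f_1$ samples containing $n_1$ individuals or
observations each, $f_2$ containing $n_2$, $f_3$ containing $n_3$, and so on:
What would be the standard-deviation of the observed proportions in
these samples? Evidently the square of the standard-deviation
in the first group would be $pq/n_1$, in the second $pq/n_2$, and so on:
therefore, as the means tend to the same values in all the groups,
we must have for the whole series, $N$ being the total number of samples,

$$ s^2 = \frac{pq}{N}\left(\frac{f_1}{n_1} + \frac{f_2}{n_2} + \frac{f_3}{n_3} + \dots\right) $$

But if $H$ be the harmonic mean of $n_1, n_2, n_3, \dots$,

$$ \frac{1}{H} = \frac{1}{N}\left(\frac{f_1}{n_1} + \frac{f_2}{n_2} + \frac{f_3}{n_3} + \dots\right) $$

and accordingly

$$ s^2 = \frac{pq}{H} \qquad (3) $$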

That  is  to  say,  where  the  number  of  observations  varies  from  one 
sample  to  another,  the  harmonic  mean  number  of  observations  in 
a  sample  must  be  substituted  for  n  in  equation  (2). 
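
A minimal sketch of this rule, using hypothetical sample sizes chosen purely for illustration (they are not data from the text): averaging $pq/n$ over the samples gives the same variance as substituting the harmonic mean $H$ for $n$.

```python
def harmonic_mean(sizes):
    """Harmonic mean of the sample sizes."""
    return len(sizes) / sum(1.0 / n for n in sizes)

def mean_sampling_variance(p, sizes):
    """Average of pq/n over all samples, i.e. the variance for the whole series."""
    q = 1.0 - p
    return sum(p * q / n for n in sizes) / len(sizes)

# Hypothetical sizes of six samples and a hypothetical true proportion.
sizes = [2, 3, 3, 4, 6, 8]
p = 0.25

H = harmonic_mean(sizes)
direct = mean_sampling_variance(p, sizes)
via_harmonic_mean = p * (1.0 - p) / H
print(H, direct, via_harmonic_mean)
```

The two variances coincide exactly (up to floating-point error), since the identity is purely algebraic; note that the harmonic mean is smaller than the arithmetic mean of the same sizes, so using the latter would understate the standard-deviation.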

Thus the following percentages (taken to the nearest unit) of
albinos were obtained in 121 litters from hybrids of Japanese
waltzing mice by albinos, crossed inter se (A. D. Darbishire,
Biometrika, iii. p. 30):

Percentage.   Frequency.     Percentage.   Frequency.
     0            40              40            3
    14             4              43            2
    17             9              50           16
    20             9              57            1
    22             1              60            3
    25            10              67            4
    29             3              80            1
    33            13             100            2

The distribution is very irregular owing to the small numbers in
the litters, and the standard-deviation is 23.09 per cent. The
numbers of litters of different sizes were given in § 27 of Chap.
VII. p. 128, and the harmonic mean size of litter was found to be
3.53. The expected proportion of albinos is 25 per cent., and
hence the standard-deviation of sampling is

$$ s = \left(\frac{25 \times 75}{3.53}\right)^{1/2} = 23.05 \text{ per cent.}, $$

in very close agreement with the actual value. The proportion
of albinos amongst all the offspring together was 24.7 per cent.
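
The comparison can be reproduced from the table of percentages above. The sketch below is illustrative only; the variable names are hypothetical, and the harmonic mean 3.53 is taken from the text rather than recomputed, since the litter sizes are not given in this section.

```python
import math

# Percentage of albinos -> number of litters, as tabulated above.
litters = {0: 40, 14: 4, 17: 9, 20: 9, 22: 1, 25: 10, 29: 3, 33: 13,
           40: 3, 43: 2, 50: 16, 57: 1, 60: 3, 67: 4, 80: 1, 100: 2}

total = sum(litters.values())
mean = sum(pct * f for pct, f in litters.items()) / total
actual_sd = math.sqrt(sum(f * (pct - mean) ** 2 for pct, f in litters.items()) / total)

# Equation (3) with p = 25 per cent., q = 75 per cent., H = 3.53 (from the text).
sampling_sd = math.sqrt(25 * 75 / 3.53)

print(total, round(actual_sd, 2), round(sampling_sd, 2))
```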

12. If one of the two proportions $p$ and $q$ become very small,
equation (1) may be put into an approximate form that is very
useful. Suppose $p$ to be the proportion that becomes very small,
so that we may neglect $p^2$ compared with $p$: then

$$ pq = p - p^2 = p \text{ approximately,} $$

and consequently we have approximately

$$ \sigma = \sqrt{np} = \sqrt{M} \qquad (4) $$

That is to say, if the proportion of successes be small, the
standard-deviation of the number of successes is the square root of
the mean number of successes. Hence we can find the standard-deviation
of sampling even though $p$ be unknown, provided only
we know that it is small.

Thus (ref. 15) in 10 Prussian army corps in 20 years (1875-1894)
there were 122 men killed by the kick of a horse, or, on an
average, there were 0.61 deaths from that cause in each army
corps annually. From equation (4) we accordingly have for the
standard-deviation of simple sampling

$$ \sigma = (0.61)^{1/2} = 0.78. $$
The frequency-distribution of the number of deaths per army
corps per annum was

Deaths.   Frequency.
   0         109
   1          65
   2          22
   3           3
   4           1

whence

$$ \sigma^2 = 0.6079, \qquad \sigma = 0.78, $$

an almost exact agreement with the standard-deviation of simple
sampling.
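
As an illustrative check of equation (4) on these figures (variable names are hypothetical), the observed variance of the distribution can be set beside the mean:

```python
import math

# Deaths per army corps per annum -> frequency, as tabulated above.
deaths = {0: 109, 1: 65, 2: 22, 3: 3, 4: 1}

corps_years = sum(deaths.values())
mean = sum(d * f for d, f in deaths.items()) / corps_years
variance = sum(f * (d - mean) ** 2 for d, f in deaths.items()) / corps_years

observed_sd = math.sqrt(variance)
sampling_sd = math.sqrt(mean)      # equation (4): square root of the mean number

print(corps_years, mean, round(variance, 4), round(observed_sd, 2), round(sampling_sd, 2))
```

The variance is very nearly equal to the mean, which is what equation (4) leads one to expect when $p$ is small.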

13.  We  may  now  turn  from  these  verifications  of  the  theoretical 
results  for  various  special  cases,  to  the  use  of  the  formulae  for 
checking  and  controlling  the  interpretation  of  statistical  results. 
If  we  observe,  in  a  statistical  sample,  a  certain  proportion  of 
objects or individuals possessing some given character (say A's),
this  proportion  differing  more  or  less  from  the  proportion  which 
for  some  reason  we  expected,  the  question  always  arises  whether 
the  difference  may  be  due  to  the  fluctuations  of  simple  sampling 
only,  or  may  be  indicative  of  definite  differences  between  the 
conditions  in  the  universe  from  which  the  sample  has  been  drawn 
and  the  assumed  conditions  on  which  we  based  our  expectation. 
Similarly,  if  we  observe  a  different  proportion  in  one  sample  from 
that  which  we  have  observed  in  another,  the  question  again  arises 
whether  this  difference  may  be  due  to  fluctuations  of  simple 
sampling  alone,  or  whether  it  indicates  a  difference  between  the 
conditions  subsisting  in  the  universes  from  which  the  two  samples 
were  drawn :  in  the  latter  case  the  difference  is  often  said  to  be 
significant.  These  questions  can  be  answered,  though  only  more 
or  less  roughly  at  present,  by  comparing  the  observed  difference 
with  the  standard-deviation  of  simple  sampling.  We  know 
roughly that the great bulk at least of the fluctuations of sampling
lie within a range of ± three times the standard-deviation;
and if an observed difference from a theoretical result greatly
exceeds these limits it cannot be ascribed to a fluctuation of
"simple sampling" as defined in § 8: it may therefore be significant.
The "standard-deviation of simple sampling" being the
basis of all such work, it is convenient to refer to it by a shorter
name. The observed proportions of A's in given samples being
regarded as differing by larger or smaller errors from the true
proportion in a very large sample from the same material, the
"standard-deviation of simple sampling" may be regarded as a
measure of the magnitude of such errors, and may be called
accordingly the standard error.

Three principal cases of comparison may be distinguished.

Case I. It is desired to know whether the deviation of a certain
observed number or proportion from an expected theoretical value
is possibly due to errors of sampling.

In this case the observed difference is to be compared with the
standard error of the theoretical number or proportion, for the
number of observations contained in the sample.

Example i. In the first illustration of § 7, 25,145 throws of a 4,
5, or 6 were made in lieu of the 24,576 expected (out of 49,152
throws altogether). The excess is 569 throws. Is this excess
possibly due to mere fluctuations of sampling?

The standard error is

$$ \sigma = \sqrt{\tfrac{1}{2} \times \tfrac{1}{2} \times 49152} = 110.9. $$

The deviation observed is 5.1 times the standard error, and,
practically speaking, could not occur as a fluctuation of simple
sampling. It may perhaps indicate a slight bias in the dice.

The problem might, of course, have been attacked equally well
from the standpoint of the proportion in lieu of the absolute
number of 4's, 5's, or 6's thrown. This proportion is 0.5116 instead
of the theoretical 0.5000, difference in excess 0.0116. The
standard error of the proportion is

$$ s = \sqrt{\frac{\tfrac{1}{2} \times \tfrac{1}{2}}{49152}} = 0.00226, $$

and the difference observed bears the same ratio to the standard
error as before, as of course it must.

Example ii. (Data from the Second Report of the Evolution
Committee of the Royal Society, 1905, p. 72.)

Certain crosses of Pisum sativum gave 5321 yellow and 1804
green seeds. The expectation is 25 per cent. of green seeds, or
1781. Can the divergence from the exact theoretical result have
arisen owing to errors of sampling only?

The numerical difference from the expected result is 23. The
standard error is

$$ \sigma = \sqrt{0.25 \times 0.75 \times 7125} = 36.6. $$

Hence the divergence from theory is only some 3/5 of the
standard error, and may very well have arisen owing simply to
fluctuations of sampling.

Working from the observed proportion of green seeds, viz. 0.2532
instead of the theoretical 0.25, we have

$$ s = \sqrt{0.25 \times 0.75 / 7125} = 0.0051, $$

and similarly the divergence from theory is only some 3/5 of the
standard error, as before.
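
The two examples of Case I. reduce to the same small calculation, sketched here in Python. The function name is hypothetical; it simply expresses the observed deviation in units of the standard error of the theoretical number.

```python
import math

def deviation_in_standard_errors(observed, n, p):
    """(observed - expected) / standard error, for an a priori proportion p."""
    expected = n * p
    standard_error = math.sqrt(n * p * (1.0 - p))
    return (observed - expected) / standard_error

# Example i: 25,145 throws of a 4, 5 or 6 out of 49,152, expectation one-half.
dice = deviation_in_standard_errors(25145, 49152, 0.5)

# Example ii: 1804 green seeds out of 5321 + 1804, expectation one-fourth.
peas = deviation_in_standard_errors(1804, 5321 + 1804, 0.25)

print(round(dice, 2), round(peas, 2))
```

Working with the proportion instead of the absolute number divides both the deviation and the standard error by $n$, so the ratio is unchanged.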

It should be noted that this method must not be used as a test
of association by comparing the difference of (AB) from (A)(B)/N
with a standard error calculated from the latter value as a
"theoretical number," for it is not a theoretical number given
a priori as in the above illustrations, and (A) and (B) are themselves
liable to errors of sampling. If we formed an association-table
between the results of tossing two coins $N$ times, $\sigma = \sqrt{N \cdot \tfrac{1}{4} \cdot \tfrac{3}{4}}$
would be the standard error for the divergence of (AB) from the
a priori value $N/4$, not the standard error for differences of (AB)
from (A)(B)/N, (A) and (B) being the numbers of heads thrown
in the case of the first and the second coin respectively.

Case II. Two samples from distinct materials or different
universes give proportions of A's $p_1$ and $p_2$, the numbers of
observations in the samples being $n_1$ and $n_2$ respectively. (a) Can
the difference between the two proportions have arisen merely as a
fluctuation of simple sampling, the two universes being really
similar as regards the proportion of A's therein? (b) If the
difference indicated were a real one, might it vanish, owing to
fluctuations of sampling, in other samples taken in precisely the
same way? This case corresponds to the testing of an association
which is indicated by a comparison of the proportion of A's amongst
B's and β's.

(a) We have no theoretical expectation in this case as to the
proportion of A's in the universe from which either sample has
been taken.

Let us find, however, whether the observed difference between $p_1$
and $p_2$ may not have arisen solely as a fluctuation of simple
sampling, the proportion of A's being really the same in both cases,
and given, let us say, by the (weighted) mean proportion in our
two samples together, i.e. by

$$ p_0 = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} $$

(the best guide that we have).

Let $\epsilon_1$, $\epsilon_2$ be the standard errors in the two samples, then

$$ \epsilon_1^2 = \frac{p_0 q_0}{n_1}, \qquad \epsilon_2^2 = \frac{p_0 q_0}{n_2}. $$

If the samples are simple samples in the sense of the previous
work, then the mean difference between $p_1$ and $p_2$ will be zero,
and the standard error of the difference $\epsilon_{12}$, the samples being
independent, will be given by

$$ \epsilon_{12}^2 = p_0 q_0 \left(\frac{1}{n_1} + \frac{1}{n_2}\right) \qquad (5) $$

If the observed difference is less than some three times $\epsilon_{12}$ it
may have arisen as a fluctuation of simple sampling only.

(b) If, on the other hand, the proportions of A's are not the same
in the material from which the two samples are drawn, but $p_1$ and
$p_2$ are the true values of the proportions, the standard errors of
sampling in the two cases are

$$ \epsilon_1^2 = \frac{p_1 q_1}{n_1}, \qquad \epsilon_2^2 = \frac{p_2 q_2}{n_2}, $$

and consequently

$$ \epsilon_{12}^2 = \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2} \qquad (6) $$

If the difference between $p_1$ and $p_2$ does not exceed some three
times this value of $\epsilon_{12}$, it may be obliterated by an error of simple
sampling on taking fresh samples in the same way from the same
material.

Further, the student should note that the value of $\epsilon_{12}$ given by
equation (6) is frequently employed, in lieu of that given by
equation (5), for testing the significance of an observed difference.
The justification of this usage we indicate briefly later (Chap.
XIV. § 3). Here it is sufficient to state that, if $n$ be large,
equation (6) gives approximately the standard-deviation of the
true values of the difference for a given observed value, and hence,
if the observed difference is greater than some three times
the value of $\epsilon_{12}$ given by (6), it is hardly possible that the true
value of the difference can be zero. The difference between the
values of $\epsilon_{12}$ given by (5) and (6) is indeed, as a rule, of more
theoretical than practical importance, for they do not differ largely
unless $p_1$ and $p_2$ differ largely, and in that case either formula will
place the difference outside the range of fluctuations of sampling.

Example iii. The following data were given in Qu. 3 of Chap.
III. for plants of Lobelia fulgens obtained by cross- and
self-fertilisation respectively:

   Parentage Cross-fertilised.          Parentage Self-fertilised.
             Height                               Height
 Above Average.   Below Average.      Above Average.   Below Average.
       17               17                  12               22

The figures indicate an association between tallness and
cross-fertilisation of parentage. Is this association significant of some
real difference, or may it have arisen solely as an "error of
sampling"? The proportion of plants above average height in the
two classes (cross- and self-fertilised) together is 29/68. The
standard-deviation of the differences due to simple sampling
between the proportions of "tall" plants in two samples of 34
observations each is therefore

$$ \epsilon_{12} = \left(\frac{29}{68} \times \frac{39}{68} \times \frac{2}{34}\right)^{1/2} = 0.120, $$

or 12.0 per cent. The actual proportions observed are 50 per
cent. and 35 per cent., difference 15 per cent. As this difference
is only slightly in excess of the standard error of the difference,
for samples of 34 observations drawn from identical material, no
definite significance could be attached to it, if it stood alone.

The  student  will  notice,  however,  that  all  the  other  cases  cited 
from  Darwin  in  the  question  referred  to  show  an  association  of 
the  same  sign,  but  rather  more  marked.  Hence  the  difference 
observed  may  be  a  real  one,  or  perhaps  the  real  difference  may  be 
greater  and  may  be  partially  masked  by  a  fluctuation  of  sampling. 
If 50 per cent. and 35 per cent. were the true proportions in the
two classes, the standard error of the percentage difference would
be, by equation (6),

$$ \epsilon_{12} = \left(\frac{50 \times 50}{34} + \frac{35 \times 65}{34}\right)^{1/2} = 11.9 \text{ per cent.}, $$

and consequently the actual difference might not infrequently be
completely masked by fluctuations of sampling, so long as
experiments were only conducted on the same small scale.
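
The two standard errors used in this example, equation (5) with the pooled proportion and equation (6) with the separate proportions, can be set side by side in a short Python sketch. The function names are hypothetical, and the 35 per cent. is used as rounded in the text.

```python
import math

def se_difference_pooled(x1, n1, x2, n2):
    """Equation (5): both samples assumed drawn from the same material."""
    p0 = (x1 + x2) / (n1 + n2)
    return math.sqrt(p0 * (1.0 - p0) * (1.0 / n1 + 1.0 / n2))

def se_difference_separate(p1, n1, p2, n2):
    """Equation (6): p1 and p2 treated as the true proportions."""
    return math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)

# Lobelia data: 17 of 34 cross-fertilised and 12 of 34 self-fertilised plants were tall.
e5 = se_difference_pooled(17, 34, 12, 34)
e6 = se_difference_separate(0.50, 34, 0.35, 34)

print(round(100 * e5, 1), round(100 * e6, 1))   # standard errors in per cent.
```

With proportions so near one another the two formulae differ very little, as the text remarks of the general case.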

Example iv. (Data from J. Gray, Memoir on the Pigmentation
Survey of Scotland, Jour. of the Royal Anthropological Institute,
vol. xxxvii., 1907.) The following are extracted from the tables
relating to hair-colour of girls at Edinburgh and Glasgow:

               Of Medium        Total        Per cent.
               Hair-colour.     observed.    Medium.
Edinburgh         4,008           9,743        41.1
Glasgow          17,529          39,764        44.1

Can the difference observed in the percentage of girls of medium
hair-colour have arisen solely through fluctuations of sampling?

In the two towns together the percentage of girls with medium
hair-colour is 43.5 per cent. If this were the true percentage,
the standard error of sampling for the difference between
percentages observed in samples of the above sizes would be

$$ \epsilon_{12} = (43.5 \times 56.5)^{1/2} \times \left(\frac{1}{9743} + \frac{1}{39764}\right)^{1/2} = 0.56 \text{ per cent.} $$

The actual difference is 3.0 per cent., or over 5 times this, and
could not have arisen through the chances of simple sampling.

If we assume that the difference is a real one and calculate the
standard error by equation (6), we arrive at the same value, viz.
0.56 per cent. With such large samples the difference could not,
accordingly, be obliterated by the fluctuations of simple sampling
alone.

Case III. Two samples are drawn from distinct material or
different universes, as in the last case, giving proportions of
A's $p_1$ and $p_2$, but in lieu of comparing the proportion $p_1$ with
$p_2$ it is compared with the proportion of A's in the two samples
together, viz. $p_0$, where, as before,

$$ p_0 = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}. $$

Required to find whether the difference between $p_1$ and $p_0$ can
have arisen as a fluctuation of simple sampling, $p_0$ being the
true proportion of A's in both samples.

This case corresponds to the testing of an association which
is indicated by a comparison of the proportion of A's amongst
the B's with the proportion of A's in the universe. The general
treatment is similar to that of Case II., but the work is complicated
owing to the fact that errors in $p_1$ and $p_0$ are not independent.

If $\epsilon_{01}$ be the standard error of the difference between $p_1$ and
$p_0$, we have at once

$$ \epsilon_{01}^2 = \epsilon_0^2 + \epsilon_1^2 - 2 r_{01} \epsilon_0 \epsilon_1, $$

$r_{01}$ being the correlation between errors of simple sampling in
$p_1$ and $p_0$. But, from the above equation relating $p_0$ to $p_1$
and $p_2$, writing it in terms of deviations in $p_0$, $p_1$ and $p_2$,
multiplying by the deviation in $p_1$ and summing, we have,
since errors in $p_1$ and $p_2$ are uncorrelated,

$$ r_{01} \epsilon_0 \epsilon_1 = \frac{n_1}{n_1 + n_2} \epsilon_1^2. $$

Therefore finally, since $\epsilon_0^2 = p_0 q_0/(n_1 + n_2)$ and $\epsilon_1^2 = p_0 q_0/n_1$,

$$ \epsilon_{01}^2 = p_0 q_0 \left(\frac{1}{n_1 + n_2} + \frac{1}{n_1} - \frac{2}{n_1 + n_2}\right) = \frac{p_0 q_0}{n_1} \cdot \frac{n_2}{n_1 + n_2}. $$

Unless the difference between $p_1$ and $p_0$ exceed, say, some
three times this value of $\epsilon_{01}$, it may have arisen solely by the
chances of simple sampling.

It will be observed that if $n_1$ be very small compared with
$n_2$, $\epsilon_{01}$ approaches, as it should, the standard error for a sample
of $n_1$ observations.

We omit, in this case, the allied problem whether, if the
difference between $p_1$ and $p_0$ indicated by the samples were
real, it might be wiped out in other samples of the same size
by fluctuations of simple sampling alone. The solution is a
little complex as we no longer have $\epsilon_0^2 = p_0 q_0/(n_1 + n_2)$.

Example v. Taking the data of Example iii., suppose that
we compare the proportion of tall plants amongst the offspring
resulting from cross-fertilisations (viz. 50 per cent.) with the
proportion amongst all offspring (viz. 29/68, or 42.6 per cent.).
As, in this case, both the subsamples have the same number
of observations, $n_1 = n_2 = 34$, and

$$ \epsilon_{01} = \left(\frac{29}{68} \times \frac{39}{68} \times \frac{1}{68}\right)^{1/2} = 0.06, $$

or 6 per cent. As in the working of Example iii., the observed
difference is only 1.25 times the standard error of the difference, and
consequently it may have arisen as a mere fluctuation of sampling.

Example vi. Taking now the figures of Example iv., suppose
that we had compared the proportion of girls of medium
hair-colour in Edinburgh with the proportion in Glasgow and
Edinburgh together. The former is 41.1 per cent., the latter
43.5 per cent., difference 2.4 per cent. The standard error of
the difference between the percentages observed in the
subsample of 9743 observations and the entire sample of 49,507
observations is therefore

$$ \epsilon_{01} = \left(43.5 \times 56.5 \times \frac{39764}{9743 \times 49507}\right)^{1/2} = 0.45 \text{ per cent.} $$

The actual difference is over five times this (the ratio must, of
course, be the same as in Example iv.), and could not have occurred
as a mere error of sampling.
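
The remark that the ratio "must, of course, be the same" under Case II. and Case III. can be checked numerically. The following Python sketch uses the hair-colour figures of Example iv.; the function names are hypothetical, and the formula for Case III. is the one derived above, $\epsilon_{01}^2 = p_0 q_0 \, n_2 / \{n_1 (n_1 + n_2)\}$.

```python
import math

def pooled(x1, n1, x2, n2):
    """Weighted mean proportion p0 in the two samples together."""
    return (x1 + x2) / (n1 + n2)

def case_two_ratio(x1, n1, x2, n2):
    """(p1 - p2) divided by the standard error of equation (5)."""
    p0 = pooled(x1, n1, x2, n2)
    se = math.sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se

def case_three_se(x1, n1, x2, n2):
    """Standard error of (p1 - p0), allowing for their correlation."""
    p0 = pooled(x1, n1, x2, n2)
    return math.sqrt(p0 * (1 - p0) * n2 / (n1 * (n1 + n2)))

def case_three_ratio(x1, n1, x2, n2):
    p0 = pooled(x1, n1, x2, n2)
    return (x1 / n1 - p0) / case_three_se(x1, n1, x2, n2)

# Edinburgh: 4,008 of 9,743; Glasgow: 17,529 of 39,764.
edinburgh, glasgow = (4008, 9743), (17529, 39764)
r2 = case_two_ratio(*edinburgh, *glasgow)
r3 = case_three_ratio(*edinburgh, *glasgow)
se01 = case_three_se(*edinburgh, *glasgow)

print(round(r2, 2), round(r3, 2), round(100 * se01, 2))
```

The two ratios agree because $p_1 - p_0 = n_2 (p_1 - p_2)/(n_1 + n_2)$, and the standard error shrinks in exactly the same proportion.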
EXERCISES.

1.  (Ref.  4  :  total  of  columns  of  all  the  13  tables  given.) 

Compare  the  actual  with  the  theoretical  mean  and  standard-deviation  for 
the  following  record  of  6500  throws  of  12  dice,  4,  5,  or  6  being  reckoned 
as  a  "  success. " 


Successes.   Frequency.     Successes.   Frequency.
    0             1              7          1351
    1            14              8           844
    2           103              9           391
    3           302             10           117
    4           711             11            21
    5          1231             12             3
    6          1411
                               Total        6500


2.  (Ref.  1.) 

Balls  were  drawn  from  a  bag  containing  equal  numbers  of  black  and  white 
balls,  each  ball  being  returned  before  drawing  another.  The  records  were  then 
grouped  by  counting  the  number  of  black  balls  in  consecutive  2's,  3's,  4's,  5's, 
etc.  The  following  give  the  distributions  so  derived  for  grouping  by  5's,  6's, 
and  7's.    Compare  actual  with  theoretical  means  and  standard-deviations. 


Successes.   (a) Grouping   (b) Grouping   (c) Grouping
               by Fives.      by Sixes.      by Sevens.
    0             30             17              9
    1            125             65             34
    2            277            166            104
    3            224            192            151
    4            136            166            148
    5             27             69             95
    6                             8             40
    7                                            4
  Total          819            683            585

3.  (Ref.  2,  p.  22.) 

Ten  thousand  drawings  of  a  ball  from  a  bag  containing  equal  numbers  of 
black  and  white  were  made  in  the  same  manner  as  in  the  preceding  example, 
and  then  grouped  into  100  sets  of  100.  The  following  gives  the  resulting 
frequency  of  different  numbers  of  white  balls.  Compare  mean  and  standard- 
deviation  with  theory. 


Number. 

Frequency. 

Number. 

Frequency. 

Number. 

Frequency. 

34 

1 

44 

8 

54 

8 

35 

45 

4 

55 

3 

36 

46 

5 

56 

5 

37 

47 

6 

57 

4 

38 

48 

5 

58 

4 

39 

1 

49 

11 

59 

40 

2 

50 

9 

60 

41 

2 

51 

5 

61 

1 

42 

52 

10 

62 

1 

4? 

I 

53 

4 

63 

1 
4. The proportion of successes in the data of Qu. 1 is 0.5097. Find the
standard-deviation of the proportion with the given number of throws, and state
whether you would regard the excess of successes as probably significant of bias
in the dice.

5.  In  the  4096  drawings  on  which  Qu.  2  is  based  2030  balls  were  black 
and  2066  white.    Is  this  divergence  probably  significant  of  bias  ? 

6.  If  a  frequency-distribution  such  as  those  of  Questions  1,  2,  and  3  be  given, 
show  how  n  and  p,  if  unknown,  may  be  approximately  determined  from  the 
mean  and  standard-deviation  of  the  distribution. 

Find n and p in this way from the data of Qu. 1 and Qu. 3.

7. Verify the following results for Table VI. of Chapter IX. p. 163, and
compare the results of the different grouping of the table on p. 263. In
calculating the actual standard-deviation, use Sheppard's correction for
grouping (p. 212).

                                Actual         Standard-
Row or Rows.        Mean.     Standard-       deviation *
                             deviation s.   of Sampling s₀.
1                   508.2       11.60           11.18
2                   509.5        6.79            6.45
3                   510.0        5.28            5.00
4                   511.1        5.03            4.22
5                   510.2        3.67            3.73
6, 7                509.7        4.13            3.24
8, 9, 10, 11        508.7        3.10            2.69
12, 13, 14          508.4        2.65            2.25
15 and upwards      508.2        2.13            1.85

8. In a case of mice-breeding (see reference given in § 11) the harmonic
mean number in a litter was 4.735, and the expected proportion of albinos
50 per cent. Find the standard-deviation of simple sampling for the
proportion of albinos in a litter, and state whether the actual standard-deviation
(21.63 per cent.) probably indicates any real variation, or not.

9.  (Data  from  Report  i..  Evolution  Committee  of  the  Royal  Society,  p.  17.) 
In  breeding  certain  stocks  408  hairy  and  126  glabrous  plants  were  obtained. 
If  the  expectation  is  one-fourth  glabrous,  is  the  divergence  significant,  or  might 
it  have  occurred  as  a  fluctuation  of  sampling  ? 

10.  (Data  of  Example  ix.  and  Qu.  5,  Chap.  III.)  Is  the  association  in 
either  of  the  following  cases  likely  to  have  arisen  as  a  fluctuation  of  simple 
sampling? 

(a)  {AB)  =  i7  {A&)  =  12  {aB)  =  21  (ai3)  =  3 

{b)  {AB)  =  309  {Afi)  =  2U  {aB)  =  U2  (a)3)  =  119 

11. The sex-ratio at birth is sometimes given by the ratio of male to female
births, instead of the proportion of male to total births. If Z is the ratio, i.e.
$Z = p/q$, show that the standard error of Z is approximately

$$ (1 + Z)\sqrt{\frac{Z}{n}}, $$

n being large, so that deviations are small compared with the mean. (The
student may find it useful to refer to § 8, Chap. XI.)


*  Based  on  the  mid-value  of  the  class-interval  for  single  rows,  or  the 
harmonic  mean  of  the  mid- values  for  groups  of  rows. 


CHAPTER  XIV 


SIMPLE  SAMPLING  CONTINUED:  EFFECT  OF 
REMOVING  THE  LIMITATIONS  OF  SIMPLE  SAMPLING. 

1. Warning as to the assumption that three times the standard error gives the
range for the majority of fluctuations of simple sampling of either sign.
2. Warning as to the use of the observed for the true value of p in
the formula for the standard error. 3. The inverse standard error, or
standard error of the true proportion for a given observed proportion:
equivalence of the direct and inverse standard errors when n is large.
4-8. The importance of errors other than fluctuations of "simple
sampling" in practice: unrepresentative or biassed samples. 9-10.
Effect of divergences from the conditions of simple sampling: (a)
effect of variation in p and q for the several universes from which the
samples are drawn. 11-12. (b) Effect of variation in p and q from one
sub-class to another within each universe. 13-14. (c) Effect of a
correlation between the results of the several events. 15. Summary.

1. There are two warnings as regards the methods adopted in
the examples in the concluding section of the last chapter
which the student should note, as they may become of importance
when the number of observations is small. In the first place, he
should remember that, while we have taken three times the
standard error as giving the limits within which the great
majority of errors of sampling of either sign are contained,
the limits are not, as a rule, strictly the same for positive and
for negative errors. As is evident from the examples of actual
distributions in § 7, Chap. XIII., the distribution of errors is not
strictly symmetrical unless $p = q = 0.5$. No theoretical rule as
to the limits can be given, but it appears from the examples
referred to and from the calculated distributions in Chap. XV.
§ 3, that a range of three times the standard error includes
the great majority of the deviations in the direction of the
longer "tail" of the distribution, while the same range on the
shorter side may extend beyond the limits of the distribution
altogether. If, therefore, $p$ be less than 0.5, our assumed range
may be greater than is possible for negative errors, or if $p$ be
greater than 0.5, greater than is possible for positive errors. The
assumption is not, however, likely as a rule to lead to a serious
mistake; as stated at the commencement of this paragraph, the
point is of importance only when $n$ is small, for when $n$ is large the
distribution tends to become sensibly symmetrical even for values
of $p$ differing considerably from 0.5. (Cf. Chap. XV. for the
properties of the limiting form of distribution.)

2. In the second place, the student should note that, where we
were unable to assign any a priori value to $p$, we have assumed
that it is sufficiently accurate to replace $p$ in the formula for the
standard error by the proportion actually observed, say $\pi$.
Where $n$ is large so that the standard error of $p$ becomes small
relatively to the product $pq$ the assumption is justifiable, and no
serious error is possible. If, however, $n$ be small, the use of the
observed value $\pi$ may lead to an under- or over-estimation of the
standard error which cannot be neglected. To get some rough
idea of the possible importance of such effects, the approximate
standard error $\epsilon$ may first be calculated as usual from the
observed proportion $\pi$, and then fresh values recalculated,
replacing $\pi$ by $\pi \pm 3\epsilon$. It should be remembered that the maximum
value of the product $pq$ is given by $p = q = 0.5$, and hence these
values, if within the limits of fluctuations of sampling, will give
one limiting value for the standard error. The procedure is by
no means exact, but may serve to give a useful warning.

Thus in Example iii. of Chap. XIII. the observed proportion of
tall plants is 29/68, or, say, 43 per cent. The standard error of
this proportion is 6 per cent., and a true proportion of 50 per
cent. is therefore well within the limits of fluctuations of sampling.
The maximum value of the standard error is therefore

$$ \left(\frac{50 \times 50}{68}\right)^{1/2} = 6.06 \text{ per cent.} $$

On the other hand, the standard error is unlikely to be lower
than that based on a proportion of 43 - 18 = 25 per cent.,

$$ \left(\frac{25 \times 75}{68}\right)^{1/2} = 5.25 \text{ per cent.} $$


3. The two difficulties mentioned in §§ 1 and 2 arise when n,
the number of cases in the sample, is small. The interpretation
of the value of the standard error is also more limited in this
case than when n is large. Suppose a large number of observations
to be made, by means of samples of n observations each, on
different masses of material, or in different universes, for each of
which the true value of p is known. On these data we could
form a correlation-table between the true proportion p in a given
universe and the observed proportion π in a sample of n observations
drawn therefrom. What we have found from the work of
the last chapter is that the standard-deviation of an array of π's
associated with a certain true value p, in this table, is $(pq/n)^{1/2}$;
but the question may be asked: What is the standard-deviation
of the array at right angles to this, i.e. the array of p's associated
with a certain observed proportion π? In other words, given an
observed proportion π, what is the standard-deviation of the true
proportions? This is the inverse of the problem with which we
have been dealing, and it is a much more difficult problem.
On general principles, however, we can see that if n be large,
the two standard-deviations will tend, on the average of all
values of p, to be nearly the same, while if n be small the
standard-deviation of the array of π's will tend to be appreciably the
greater of the two. For if π = p + δ, δ is uncorrelated with p,
and therefore if $\sigma_p$ be the standard-deviation of p in all the
universes from which samples are drawn, $\sigma_\pi$ the
standard-deviation of observed proportions in the samples, and $\sigma_\delta$ the
standard-deviation of the differences,

$$\sigma_\pi^2 = \sigma_p^2 + \sigma_\delta^2 .$$

But $\sigma_\delta^2$ varies inversely as n. Hence if n become very large, $\sigma_\delta$
becomes very small, $\sigma_\pi$ becomes sensibly equal to $\sigma_p$, and therefore
the standard-deviations of the arrays, on an average, are also
sensibly equal. If n be large, therefore, $[\pi(1-\pi)/n]^{1/2}$ may be
taken as giving, with sufficient exactness, the standard-deviation
of the true proportion p for a given observed proportion π. But
if n be small, $\sigma_\delta^2$ cannot be neglected in comparison with $\sigma_p^2$, $\sigma_\pi$ is
therefore appreciably greater than $\sigma_p$, and the standard-deviation
of the array of π's is, on an average of all arrays, correspondingly
greater than the standard deviation of the array of p's (the
statement is not true for every pair of corresponding arrays, especially
for extreme values of p near 0 and 1). Further, it should be
noticed that, while the regression of π on p is unity (i.e. the
mean of the array of π's is identical with p, the type of the
array), the regression of p on π is less than unity. If we
assume, therefore, that a tabulation of all possible chances, observed
for every conceivable subject, would give a distribution of p
ranging uniformly between 0 and 1, or indeed grouped symmetrically
in any way round 0.5, any observed value π greater than
0.5 will probably correspond to a true value of p slightly lower
than π, and conversely. We have already referred to the use of
the inverse standard error in § 13 of Chap. XIII. (Case II., p. 269).
If we determine, for example, the standard error of the difference
between two observed proportions by equation (6) of that chapter,
this may be taken, provided n be large, as approximately the
standard-deviation of true differences for the given observed
difference.
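The argument of § 3 can be made concrete for one assumed case, that in which p ranges uniformly between 0 and 1. The following sketch is an illustration under that assumption only; the function name and the use of exact fractions are choices made here. It computes the three variances and the regression of p on π, which is the covariance of p and π (equal to the variance of p, since δ is uncorrelated with p) divided by the variance of π.

```python
from fractions import Fraction

def inverse_problem_uniform(n):
    """Variances for samples of n when p is uniform between 0 and 1 (assumed)."""
    var_p = Fraction(1, 12)                    # variance of a uniform p
    mean_pq = Fraction(1, 2) - Fraction(1, 3)  # mean of p(1 - p) = E[p] - E[p^2]
    var_delta = mean_pq / n                    # average of pq/n over all universes
    var_pi = var_p + var_delta                 # delta is uncorrelated with p
    regression_p_on_pi = var_p / var_pi        # cov(p, pi) = var_p
    return var_p, var_delta, var_pi, regression_p_on_pi

for n in (10, 100, 1000):
    var_p, var_delta, var_pi, slope = inverse_problem_uniform(n)
    print(n, float(var_delta / var_p), float(slope))
```

Under this assumption the regression works out to n/(n + 2), so that it falls short of unity by an amount which becomes negligible as n is made large.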

4.  The  use  of  standard  errors  must  be  exercised  with  care.  It 
is  very  necessary  to  remember  the  limited  assumptions  on  which 
the  theory  of  simple  sampling  is  based,  and  to  bear  in  mind  that 
it  covers  those  fluctuations  alone  which  exist  when  all  the  assumed 
conditions  are  fulfilled.  The  formulae  obtained  for  the  standard 
errors  of  proportions  and  of  their  differences  have  no  bearing 
except  on  the  one  question,  whether  an  observed  divergence  of  a 
certain  proportion  from  a  certain  other  proportion  that  might  be 
observed  in  a  more  extended  series  of  observations,  or  that  has 
actually  been  observed  in  some  other  series,  might  or  might  not 
be  due  to  fluctuations  of  simple  sampling  alone.  Their  use  is 
thus  quite  restricted,  for  in  many  cases  of  practical  sampling  this 
is  not  the  principal  question  at  issue.  The  principal  question  in 
many  such  cases  concerns  quite  a  different  point,  viz.  whether  the 
observed proportion π in the sample may not diverge from the
proportion p existing in the universe from which it was drawn,
owing to the nature of the conditions under which the sample was
taken, π tending to be definitely greater or definitely less than
p. Such divergence between π and p might arise in two distinct
ways: (1) owing to variations of classification in sorting the
A's and α's, the characters not being well defined, a source of
error which we need not further discuss, but one which may lead
to serious results [cf. ref. 5 of Chap. V.]; (2) owing to either A's
or α's tending to escape the attentions of the sampler. To give
an illustration from artificial chance, if on drawing samples from
a bag containing a very large number of black and white balls
the observed proportion of black balls was π, we could not
necessarily infer that the proportion of black balls in the bag was
approximately π, even though the standard error were small, and
we knew that the proportions in successive samples were subject
to the law of simple sampling. For the black balls might be,
say, much more highly polished than the white ones, so as to
tend to escape the fingers of the sampler, or they might be represented
by a number of lively black insects sheltering amongst
white stones: in neither case would the ratio of black balls to
white, or of insects to stones, be represented in their proper proportions.
Clearly, in any parallel case, inferences as to the
material from which the sample is drawn are of a very doubtful
and uncertain kind, and it is this uncertainty whether the chance
of inclusion in the sample is the same for A's and α's, far more
than the mere divergences between different samples drawn in
the same way, which renders many statistical results based on
samples so dubious.
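The illustration of the polished balls may be put into figures under a simple assumed model, which is not part of the text: suppose a black ball is retained by the sampler with only a fraction `relative_chance` of the chance of a white ball. The names and the values chosen below are hypothetical.

```python
import math

def observed_proportion(p, relative_chance):
    """Long-run proportion of black balls in the samples.

    p is the true proportion of black balls in the bag; relative_chance is the
    chance of a black ball being retained relative to a white one (an assumed
    parameter, 1.0 meaning no bias).
    """
    q = 1.0 - p
    return relative_chance * p / (relative_chance * p + q)

def standard_error(pi, n):
    """Standard error of simple sampling for a proportion pi in n drawings."""
    return math.sqrt(pi * (1.0 - pi) / n)

p_true = 0.5
pi_expected = observed_proportion(p_true, relative_chance=0.5)
se = standard_error(pi_expected, n=10_000)
bias_in_se_units = (p_true - pi_expected) / se
print(pi_expected, se, bias_in_se_units)
```

With these assumed figures the standard error is under half of one per cent., yet the observed proportion differs from the true proportion by many times that amount: the smallness of the standard error says nothing as to the bias.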

5. Thus in collecting returns as to family income and expenditure
from working-class households, the families with lower
incomes  are  almost  certain  to  be  under-represented ;  they  largely 
"escape  the  sampler's  fingers"  from  their  simple  lack  of  ability 
to  keep  the  necessary  accounts.  It  is  almost  impossible  to  say, 
however,  to  what  extent  they  are  under-represented,  or  to  form 
any  estimate  as  to  the  possible  error  when  two  such  samples 
taken  by  different  persons  at  different  times,  or  in  different  places, 
are  compared.  Again,  if  estimates  as  to  crop-production  are 
formed  on  the  basis  of  a  limited  number  of  voluntary  returns, 
the  estimates  are  likely  to  err  in  excess,  as  the  persons  who 
make  the  returns  will  probably  include  an  undue  proportion 
of  the  more  intelligent  farmers  whose  crops  will  tend  to  be 
above  average.  Whilst  voluntary  returns  are  in  this  way  liable 
to  lead  to  more  or  less  unrepresentative  samples,  compulsory 
sampling does not evade the difficulty. Compulsion could not ensure
equally accurate and trustworthy returns from illiterate
and  well-educated  workmen,  from  intelligent  and  unintelligent 
farmers.  The  following  of  some  definite  rule  in  drawing  the 
sample  may  also  produce  unrepresentative  samples :  if  samples 
of  fruit  were  taken  solely  from  the  top  layers  of  baskets  exposed 
for  sale,  the  results  might  be  unduly  favourable  ;  if  from  the 
bottom  layer,  unduly  unfavourable. 

6.  In  such  cases  we  can  see  that  any  sample,  taken  in  the 
way  supposed,  is  likely  to  be  definitely  biassed,  in  the  sense 
that  it  will  not  tend  to  include,  even  in  the  long  run,  equal 
proportions of the A's and α's in the original material. In other
cases  there  may  be  no  obvious  reason  for  presuming  such  bias, 
but,  on  the  other  hand,  no  certainty  that  it  does  not  exist.  Thus 
if  we  noted  the  hair-colours  of  the  children  in,  say,  one 
school  in  ten  in  a  large  town,  the  question  would  arise  whether 
this  method  would  tend  to  give  an  unbiassed  sample  of  all  the 
children.  No  assured  answer  could  be  given :  conjectures  on 
the  matter  would  be  based  in  part  on  the  way  in  which  the 
schools  were  selected,  e.g.  the  volunteering  of  teachers  for  the  work 
might  in  itself  introduce  an  element  of  bias.  Again,  if  say 
10,000  herrings  were  measured  as  landed  at  various  North  Sea 
ports,  and  the  question  were  raised  whether  the  sample  was 
likely  to  be  an  unbiassed  sample  of  North  Sea  herrings,  no 
assured  answer  could  be  given.  There  may  be  no  definite  reason 
for  expecting  definite  bias  in  either  case,  but  it  may  exist,  and 
no mere examination of the sample itself can give any information
as to whether it exists or no.

7. Such an examination may be of service, however, as
indicating  one  possible  source  of  bias,  viz.  great  heterogeneity  in 
the  original  material.  If,  for  example,  in  the  first  illustration, 
the  hair-colours  of  the  children  differed  largely  in  the  different 
schools — much  more  largely  than  would  be  accounted  for  by 
fluctuations  of  simple  sampling — it  would  be  obvious  that  one 
school  would  tend  to  give  an  unrepresentative  sample,  and 
questionable  therefore  whether  the  five,  ten  or  fifteen  schools 
observed  might  not  also  have  given  an  unrepresentative  sample. 
Similarly,  if  the  herrings  in  different  catches  varied  largely,  it 
would,  again,  be  difficult  to  get  a  representative  sample  for  a 
large  area.  But  while  the  dissimilarity  of  subsamples  would 
then  be  evidence  as  to  the  difficulty  of  obtaining  a  representative 
sample,  the  similarity  of  subsamples  would,  of  course,  be  no 
evidence  that  the  sample  was  representative,  for  some  very 
different  material  which  should  have  been  represented  might 
have  been  missed  or  overlooked. 

8.  The  student  must  therefore  be  very  careful  to  remember 
that even if some observed difference exceed the limits of fluctuation
in simple sampling, it does not follow that it exceeds the
limits of fluctuation due to what the practical man would regard
(and quite rightly regard) as the chances of sampling. Further,
he must remember that if the standard error be small, it by no
means follows that the result is necessarily trustworthy: the
smallness of the standard error only indicates that it is not
untrustworthy owing to the magnitude of fluctuations of simple
sampling. It may be quite untrustworthy for other reasons:
owing to bias in taking the sample, for instance, or owing to definite
errors in classifying the A's and α's. On the other hand, of course,
it should also be borne in mind that an observed proportion is not
necessarily incorrect, but merely to a greater or less extent
untrustworthy if the standard error be large. Similarly, if an
observed proportion $\pi_1$ in a sample drawn from one universe be
greater than an observed proportion $\pi_2$ in a sample drawn from
another universe, but $\pi_1 - \pi_2$ is considerably less than three times
the standard error of the difference, it does not, of course, follow
that the true proportions for the given universes, $p_1$ and $p_2$, are
most probably equal. On the contrary, $p_1$ most likely exceeds $p_2$;
the standard error only warns us that this conclusion is more or
less uncertain, and that possibly $p_2$ may even exceed $p_1$.

9.  Let  us  now  consider  the  effect,  on  the  standard-deviation  of 
sampling,  of  divergences  from  the  conditions  of  simple  sampling 
which  were  laid  down  in  §  8  of  Chap.  XIII. 

First suppose the condition (a) to break down, so that there is
some essential difference between the localities from which, or the
conditions under which, samples are drawn, or that some essential
change has taken place during the period of sampling. We may
represent such circumstances in a case of artificial chance by
supposing that for the first $f_1$ throws of n dice the chance of
success for each die is $p_1$, for the next $f_2$ throws $p_2$, for the next $f_3$
throws $p_3$, and so on, the chance of success varying from time to
time, just as the chance of death, even for individuals of the same
age and sex, varies from district to district. Suppose, now, that
the records of all these throws are pooled together. The mean
number of successes per throw of the n dice is given by

$$M = \frac{n}{N}\,\Sigma(fp) = n p_0 ,$$

where $N = \Sigma(f)$ is the whole number of throws and $p_0$ is the mean
value $\Sigma(fp)/N$ of the varying chance p. To find the
standard-deviation of the number of successes at each throw consider that
the first set of throws contributes to the sum of the squares of
deviations an amount

$$f_1\left[n p_1 q_1 + n^2 (p_1 - p_0)^2\right],$$

$n p_1 q_1$ being the square of the standard-deviation for these throws,
and $n(p_1 - p_0)$ the difference between the mean number of
successes for the first set and the mean for all the sets together.
Hence the standard-deviation σ of the whole distribution is given
by the sum of all quantities like the above, or

$$N\sigma^2 = n\,\Sigma(fpq) + n^2\,\Sigma f(p - p_0)^2 .$$

Let $\sigma_p$ be the standard-deviation of p, then the last term is
$N n^2 \sigma_p^2$, and substituting 1 − p for q we have

$$\sigma^2 = n p_0 - n p_0^2 - n\sigma_p^2 + n^2\sigma_p^2$$

$$\phantom{\sigma^2} = n p_0 q_0 + n(n-1)\,\sigma_p^2 \qquad (1)$$

This is the formula corresponding to equation (1) of Chap.
XIII.: if we deal with the standard-deviation of the proportion
of successes, instead of that of the absolute number, we have,
dividing through by $n^2$, the formula corresponding to equation
(2) of Chap. XIII., viz.

$$s^2 = \frac{p_0 q_0}{n} + \frac{n-1}{n}\,\sigma_p^2 \qquad (2)$$
10. If n be large and $s_0$ be the standard-deviation calculated
from the mean proportion of successes $p_0$, equation (2) is sensibly
of the form

$$s^2 = s_0^2 + \sigma_p^2 ,$$

and hence, knowing s and $s_0$, we can find $\sigma_p$, the standard-deviation
of the chance or proportion in the universes from which the
samples have been drawn.

Table showing Frequencies of Registration Districts in England and Wales
with Different Proportions of Deaths in Childbirth (including Deaths
from Puerperal Fever) per 1000 Births in the same Year, for the same
Groups of Districts as in the Table of Chap. XIII. § 10. Data from same
source. Decade 1881-90.

                              Number of Births in the Decade.
Deaths in
Childbirth per     1500    3500    4500   10,000  15,000  30,000  50,000
1000 Births.        to      to      to      to      to      to      to
                   2500.   4000.   5000.  15,000. 20,000. 50,000. 90,000.

3.5- 4.0             5       6       5       8       5       5       9
4.0- 4.5             6       5       8      23       4       9       6
4.5- 5.0             2       5       9      14      11       7       5
5.0- 5.5             7       3       6      14       6       8       7
5.5- 6.0             5       3       4       5       2       5       4

Total               36      38      40      73      33      43      35
Mean               5.29    4.71    4.45    4.68    4.99    5.13    4.64
Standard-
deviation          1.77    1.37    1.09    1.01    0.99    1.12    0.87
Theoretical
standard-deviation
corresponding to
mean births        1.62    1.12    0.97    0.61    0.53    0.36    0.26
$\sqrt{s^2 - s_0^2}$       0.71    0.80    0.51    0.80    0.84    1.07    0.83

Entries for the remaining intervals (columns not recoverable), in the
order printed:

1.5- 2.0: 2
2.0- 2.5: 1, 1, 1
2.5- 3.0: 1, 3, 1
3.0- 3.5: 1, 5, 2, 4, 1, 2
6.0- 6.5: 1, 5, 4, 1, 1
6.5- 7.0: 3, 1, 1, 3, 2, 1
7.0- 7.5: 1, 1, 4
7.5- 8.0: 1
8.0- 8.5: none
8.5- 9.0: 1, 1, 1
9.0- 9.5 and 9.5-10.0: 1, 1
10.0-10.5 and 10.5-11.0: 1
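The recovery of σ_p from s and s0 is a one-line computation. The sketch below applies it to the observed and theoretical standard-deviations printed at the foot of the table of deaths in childbirth; the function name is illustrative, and because the printed figures are rounded the results may differ from the tabulated values by a unit or so in the second decimal place.

```python
import math

def significant_sd(s, s0):
    """sqrt(s^2 - s0^2), or None when the observed value falls short of s0."""
    diff = s * s - s0 * s0
    return math.sqrt(diff) if diff >= 0 else None

# Observed and theoretical standard-deviations for the seven groups of districts
observed = [1.77, 1.37, 1.09, 1.01, 0.99, 1.12, 0.87]
theoretical = [1.62, 1.12, 0.97, 0.61, 0.53, 0.36, 0.26]

estimates = [significant_sd(s, s0) for s, s0 in zip(observed, theoretical)]
print([round(e, 2) for e in estimates])
```

The function returns None where s falls short of s0, in which case no real value of σ_p can be found.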

The values of $\sqrt{s^2 - s_0^2}$ are tabulated at the foot of the table
showing the distribution of the proportion of male births in
certain registration districts of England, in § 10 of Chap. XIII.
p. 263. It will be seen that in the first group of small districts
there appears to be a significant standard-deviation of some 6
units in the proportion of male births per thousand, but in the
more urban districts this falls to 1 or 2 units; in one case only
does s fall short of $s_0$. In the table on p. 283 are given some
different data relating to the deaths of women in childbirth in the
same groups of districts, and in this case the effect of definite
causes is relatively larger, as one might expect. The values of
$\sqrt{s^2 - s_0^2}$ suggest an almost uniform significant standard-deviation
$\sigma_p = 0.8$ in the deaths of women per thousand births, five out of
the eight values being very close to this average. The figures of
this case also bring out clearly one important consequence of (2),
viz. that if we make n large s becomes sensibly equal to $\sigma_p$, while
if we make n small s becomes more nearly equal to $(p_0 q_0/n)^{1/2}$. Hence
if we want to know the significant standard-deviation of the proportion
p (the measure of its fluctuation owing to definite causes),
n should be made as large as possible; if, on the other hand, we
want to obtain good illustrations of the theory of simple sampling
n should be made small. If n be very large the actual
standard-deviation may evidently become almost indefinitely large compared
with the standard-deviation of sampling. Thus during the
20 years 1855-74 the death-rate in England and Wales fluctuated
round a mean value of 22.2 per thousand with a standard-deviation
of 0.86. Taking the mean population as roughly 21 millions,
the standard-deviation of sampling is approximately

$$\left(\frac{22.2 \times 977.8}{21{,}000{,}000}\right)^{1/2} = 0.032 .$$

This is only about one twenty-seventh of the actual value.
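The arithmetic of the death-rate illustration may be reproduced as follows; the function name is illustrative, and the figures are those quoted in the text (a rate of 22.2 per thousand, a population of 21 millions, and an actual standard-deviation of 0.86).

```python
import math

def sampling_sd_per_thousand(rate_per_thousand, population):
    """Standard-deviation of simple sampling of a rate, in units per thousand."""
    p = rate_per_thousand / 1000.0
    return 1000.0 * math.sqrt(p * (1.0 - p) / population)

sd_sampling = sampling_sd_per_thousand(22.2, 21_000_000)
ratio = 0.86 / sd_sampling
print(round(sd_sampling, 4), round(ratio, 1))
```

The ratio of the actual to the sampling standard-deviation comes out at a little under 27, in agreement with the statement above.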

11. Now consider the effect of altering the second condition
of simple sampling, given in § 8 (b) of Chapter XIII., viz. the
condition that the chances p and q shall be the same for every
die or coin in the set, or the circumstances that regulate the
appearance of the character observed the same for every individual
or every sub-class in each of the universes from which samples
are drawn. Suppose that in the group of n dice thrown the
chances for $m_1$ dice are $p_1$, $q_1$; for $m_2$ dice, $p_2$, $q_2$; and so on,
the chances varying for different dice, but being constant
throughout the experiment. The case differs from the last, as
in that the chances were the same for every die, at any one
throw, but varied from one throw to another: now they are constant
from throw to throw, but differ from one die to another as
they would in any ordinary set of badly made dice. Required to
find the effect of these differing chances.

For the mean number of successes we evidently have

$$M = m_1 p_1 + m_2 p_2 + m_3 p_3 + \dots = n p_0 ,$$

$p_0$ being the mean chance $\Sigma(mp)/n$. To find the standard-deviation
of the number of successes at each throw, it should be noted that
this may be regarded as made up of the number of successes in
the $m_1$ dice for which the chances are $p_1$, $q_1$, together with the
number of successes amongst the $m_2$ dice for which the chances
are $p_2$, $q_2$, and so on: and these numbers of successes are all
independent. Hence

$$\sigma^2 = m_1 p_1 q_1 + m_2 p_2 q_2 + m_3 p_3 q_3 + \dots = \Sigma(mpq).$$

Substituting 1 − p for q, as before, and using $\sigma_p$ to denote the
standard-deviation of p,

$$\sigma^2 = n p_0 q_0 - n\sigma_p^2 \qquad (3)$$

or if s be, as before, the standard-deviation of the proportion of
successes,

$$s^2 = \frac{p_0 q_0}{n} - \frac{\sigma_p^2}{n} \qquad (4)$$
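Equation (3) admits of the same kind of numerical check as equation (1). In the sketch below the chances assigned to the six dice are hypothetical; the exact distribution of the number of successes is built up one die at a time, and its variance compared with n p0 q0 − nσ_p².

```python
def successes_variance_exact(chances):
    """Variance of the number of successes for independent, unequal chances."""
    pmf = [1.0]  # distribution of the count before any die is thrown
    for p in chances:
        nxt = [0.0] * (len(pmf) + 1)
        for k, w in enumerate(pmf):
            nxt[k] += w * (1 - p)      # this die fails
            nxt[k + 1] += w * p        # this die succeeds
        pmf = nxt
    mean = sum(k * w for k, w in enumerate(pmf))
    return sum((k - mean) ** 2 * w for k, w in enumerate(pmf))

def successes_variance_formula(chances):
    """Equation (3): n p0 q0 - n sigma_p^2."""
    n = len(chances)
    p0 = sum(chances) / n
    var_p = sum((p - p0) ** 2 for p in chances) / n
    return n * p0 * (1 - p0) - n * var_p

# Hypothetical set of six badly made dice
dice = [0.10, 0.15, 0.20, 0.25, 0.30, 0.50]
exact = successes_variance_exact(dice)
formula = successes_variance_formula(dice)
print(exact, formula)
```

Both values fall short of the 1.125 which would be given by six dice all having the mean chance 0.25.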

12. The effect of the chances varying for the individual dice or
other "events" is therefore to lower the standard-deviation, as
calculated from the mean proportion $p_0$, and the effect may
conceivably be considerable. To take a limiting case, if p be zero
for half the events and unity for the remainder, $p_0 = q_0 = 1/2$, and
$\sigma_p = 1/2$, so that s is zero. To take another illustration, still somewhat
extreme, if the values of p are uniformly distributed over
the whole range between 0 and 1, $p_0 = q_0 = 1/2$ as before, but
$\sigma_p^2 = 1/12 = 0.0833$ (Chap. VIII. § 12, p. 143). Hence $s^2 = 0.1667/n$,
$s = 0.408/\sqrt{n}$, instead of $0.5/\sqrt{n}$, the value of s if the chances are
1/2 in every case. In most practical cases, however, the effect will be
much less. Thus the standard-deviation of sampling for a death-rate
of, say, 18 per thousand in a population of uniform age and
one sex is $(18 \times 982)^{1/2}/\sqrt{n} = 133/\sqrt{n}$. In a population of the age
composition of that of England and Wales, however, the death-rate
is not, of course, uniform, but varies from a high value in
infancy (say 150 per thousand), through very low values (2 to 4
per thousand) in childhood to continuously increasing values in
old age; the standard-deviation of the rate within such a population
is roughly about 30 per thousand. But the effect of this
variation on the standard-deviation of simple sampling is quite
small, for, as calculated from equation (4),

$$s^2 = \frac{1}{n}\,(18 \times 982 - 900), \qquad s = 129.5/\sqrt{n},$$

as compared with $133/\sqrt{n}$.
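The three illustrations of § 12 all follow from equation (4), and may be reproduced as below. The helper name is illustrative; each value is s multiplied by the square root of n, and the death-rate figures are worked in units per thousand, as in the text.

```python
import math

def s_times_sqrt_n(p0_q0, var_p):
    """sqrt(n) * s from equation (4): s^2 = (p0 q0 - sigma_p^2) / n."""
    return math.sqrt(p0_q0 - var_p)

# Limiting case: p is 0 for half the events and 1 for the rest
half_zero_half_one = s_times_sqrt_n(0.5 * 0.5, 0.25)

# p uniformly distributed between 0 and 1, so that sigma_p^2 = 1/12
uniform_chances = s_times_sqrt_n(0.5 * 0.5, 1 / 12)

# Death-rate of 18 per thousand, with and without a standard-deviation
# of 30 per thousand in the rate within the population
death_rate_simple = s_times_sqrt_n(18 * 982, 0)
death_rate_varying = s_times_sqrt_n(18 * 982, 30**2)

print(half_zero_half_one, uniform_chances, death_rate_simple, death_rate_varying)
```

The last two values differ by less than 3 per cent., which is the sense in which the effect is said to be quite small.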

13.  We  have  finally  to  pass  to  the  third  condition  (c)  of  §  8,  Chap. 
XIII.,  and  to  discuss  the  effect  of  a  certain  amount  of  dependence 
between  the  several  "  events  "  in  each  sample.  We  shall  suppose, 
however,  that  the  two  other  conditions  (a)  and  (b)  are  fulfilled, 
the  chances  p  and  q  being  the  same  for  every  event  at  every  trial, 
and  constant  throughout  the  experiment.  The  problem  is  again 
most  simply  treated  on  the  lines  of  §  5  of  the  last  chapter.  The 
standard-deviation for each event is $(pq)^{1/2}$ as before, but the events
are no longer independent: instead, therefore, of the simple
expression

$$\sigma^2 = n pq ,$$

we must have (cf. Chap. XI. § 2)

$$\sigma^2 = n pq + 2pq\,(r_{12} + r_{13} + \dots + r_{23} + \dots),$$

where $r_{12}$, $r_{13}$, etc. are the correlations between the results of the
first and second, first and third events, and so on: correlations
for variables (number of successes) which can only take the
values 0 and 1, but may nevertheless, of course, be treated as
ordinary variables (cf. Chap. XI. § 10). There are n(n − 1)/2
correlation-coefficients, and if, therefore, r is the arithmetic mean
of the correlations we may write

$$\sigma^2 = n pq\,[1 + r(n-1)] \qquad (5)$$

The standard-deviation of simple sampling will therefore be
increased or diminished according as the average correlation
between the results of the single events is positive or negative,
and the effect may be considerable, as σ may be reduced to zero
or increased to $n(pq)^{1/2}$. For the standard deviation of the proportion
of successes in each sample we have the equation

$$s^2 = \frac{pq}{n}\,[1 + r(n-1)] \qquad (6)$$
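Equation (5) may be illustrated by a small exact computation. The "linked" model used below is an assumption introduced here for the purpose and is not taken from the text: with chance c all n events of a sample share one common result, and otherwise they are independent, so that every pair of events has correlation c. The function names are likewise illustrative.

```python
from math import comb

def variance_formula(n, p, r):
    """Equation (5): sigma^2 = n p q [1 + r (n - 1)]."""
    return n * p * (1 - p) * (1 + r * (n - 1))

def variance_linked_model(n, p, c):
    """Exact variance of the number of successes under the assumed linked model.

    With chance c all n events take one common result (success with chance p);
    with chance 1 - c the n events are independent with chance p each.
    """
    pmf = [(1 - c) * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    pmf[0] += c * (1 - p)   # all fail together
    pmf[n] += c * p         # all succeed together
    mean = sum(k * w for k, w in enumerate(pmf))
    return sum((k - mean) ** 2 * w for k, w in enumerate(pmf))

print(variance_linked_model(10, 0.3, 0.2), variance_formula(10, 0.3, 0.2))
print(variance_formula(10, 0.3, 1.0))       # upper limit, n^2 p q
print(variance_formula(10, 0.3, -1 / 9))    # lower limit, r = -1/(n - 1)
```

The last two lines show the limits mentioned in the text: with r = 1 the variance is n² pq, and with r = −1/(n − 1) it vanishes.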

It should be noted that, as the means and standard-deviations
for our variables are all identical, r is the correlation-coefficient
for a table formed by taking all possible pairs of results in the
n events of each sample.

It should also be noted that the case when r is positive covers
the departure from the rules of simple sampling discussed in
§§ 9-10: for if we draw successive samples from different records,
this introduces the positive correlation at once, even although the
results of the events at each trial are quite independent of one
another. Similarly, the case discussed in §§ 11-12 is covered by
the case when r is negative: for if the chances are not the same
for every event at each trial, and the chance of success for some
one event is above the average, the mean chance of success for the
remainder must be below it. The cases (a), (b) and (c) are, however,
best kept distinct, since a positive or negative correlation
may arise for reasons quite different from those discussed in
§§ 9-12.

14. As a simple illustration, consider the important case of
sampling from a limited universe, e.g. of drawing n balls in
succession from the whole number w in a bag containing pw white
balls and qw black balls. On repeating such drawings a large
number of times, we are evidently equally likely to get a white
ball or a black ball for the first, second, or nth ball of the sample:
the correlation-table formed from all possible pairs of every sample
will therefore tend in the long run to give just the same form of
distribution as the correlation-table formed from all possible pairs
of the w balls in the bag. But from Chap. XI. § 11 we
know that the correlation-coefficient for this table is −1/(w − 1),
whence

$$\sigma^2 = n pq\left[1 - \frac{n-1}{w-1}\right] = n pq\,\frac{w-n}{w-1} .$$

If n = 1, we have the obviously correct result that $\sigma = (pq)^{1/2}$, as
in drawing from unlimited material: if, on the other hand, n = w,
σ becomes zero as it should, and the formula is thus checked for
simple cases. For drawing 2 balls out of 4, σ becomes 0.816
$(npq)^{1/2}$; for drawing 5 balls out of 10, 0.745 $(npq)^{1/2}$; in the case
of drawing half the balls out of a very large number, it approximates
to $(0.5\,npq)^{1/2}$, or 0.707 $(npq)^{1/2}$.

In the case of contagious or infectious diseases, or of certain
forms of accident that are apt, if fatal at all, to result in wholesale
deaths, r is positive, and if n be large (as it usually is in such
cases) a very small value of r may easily lead to a very great increase
in the observed standard-deviation. It is difficult to give a really
good example from actual statistics, as the conditions are hardly
ever constant from one year to another, but the following will
serve to illustrate the point. During the twenty years 1887-1906
there were 2107 deaths from explosions of firedamp or coal-dust
in the coal-mines of the United Kingdom, or an average of 105
deaths per annum. From § 12 of Chap. XIII. it follows that this
should be the square of the standard-deviation of simple sampling,
or the standard-deviation itself approximately 10.3. But the
square of the actual standard-deviation is 7178, or its value 84.7,
the numbers of deaths ranging between 14 (in 1903) and 317
(in 1894). This large standard-deviation, to judge from the
figures, is partly, though not wholly, due to a general tendency to
decrease in the numbers of deaths from explosions in spite of a
large increase in the number of persons employed; but even if we
ignore this, the magnitude of the standard-deviation can be
accounted for by a very small value of the correlation r, expressive
of the fact that if an explosion is sufficiently serious to be fatal to
one individual, it will probably be fatal to others also. For if $\sigma_0$
denote the standard-deviation of simple sampling, σ the
standard-deviation of sampling given by equation (5), we have

$$r = \frac{\sigma^2 - \sigma_0^2}{(n-1)\,\sigma_0^2} .$$

Whence, from the above data, taking the numbers of persons
employed underground at a rough average of 560,000,

$$r = \frac{7178 - 105}{560000 \times 105} = +0.00012 .$$
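The two illustrations of § 14 may be checked numerically. The sketch below gives the factor by which the standard-deviation is reduced in drawing n balls from a limited universe of w, and solves equation (5) for r with the firedamp figures quoted above; the function names are illustrative.

```python
import math

def finite_universe_factor(n, w):
    """Ratio of sigma to (npq)^(1/2) when n balls are drawn from w in all."""
    return math.sqrt((w - n) / (w - 1))

def mean_correlation(var_observed, var_simple, n):
    """Equation (5) solved for r: (sigma^2 - sigma_0^2) / ((n - 1) sigma_0^2)."""
    return (var_observed - var_simple) / ((n - 1) * var_simple)

print(round(finite_universe_factor(2, 4), 3))
print(round(finite_universe_factor(5, 10), 3))
print(round(finite_universe_factor(500_000, 1_000_000), 3))

# Deaths from explosions, 1887-1906: observed variance 7178, simple-sampling
# variance 105, about 560,000 persons employed underground
r_firedamp = mean_correlation(7178, 105, 560_000)
print(r_firedamp)
```

So small a correlation as this, acting on so large a value of n, suffices to multiply the standard-deviation of simple sampling some eightfold.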


15.  Summarising  the  preceding  paragraphs,  §§  9-14,  we  see 
that  if  the  chances  p  and  q  differ  for  the  various  universes, 
districts,  years,  materials,  or  whatever  they  may  be  from  which 
the  samples  are  drawn,  the  standard-deviation  observed  will  be 
greater  than  the  standard-deviation  of  simple  sampling,  as 
calculated  from  the  average  values  of  the  chances  :  if  the  average 
chances  are  the  same  for  each  universe  from  which  a  sample  is 
drawn, but vary from individual to individual or from one sub-class
to another within the universe, the standard-deviation
observed  will  be  less  than  the  standard-deviation  of  simple 
sampling  as  calculated  from  the  mean  values  of  the  chances : 
finally,  if  p  and  q  are  constant,  but  the  events  are  no  longer 
independent,  the  observed  standard-deviation  will  be  greater  or 
less than the simplest theoretical value according as the correlation
between the results of the single events is positive or
negative.  These  conclusions  further  emphasise  the  need  for 
caution  in  the  use  of  standard  errors.    If  we  find  that  the 
standard-deviation in some case of sampling exceeds the standard-
deviation of simple sampling, two interpretations are possible:
either that p and q are different in the various universes from
which samples have been drawn (i.e. that the variations are
more  or  less  definitely  significant  in  the  sense  of  §  13,  Chap.  XIII.), 
or  that  the  results  of  the  events  are  positively  correlated  inter 
se.  If  the  actual  standard-deviation  fall  short  of  the  standard- 
deviation  of  simple  sampling  two  interpretations  are  again 
possible,  either  that  the  chances  p  and  q  vary  for  different 
individuals  or  sub-classes  in  each  universe,  while  approximately 
constant  from  one  universe  to  another,  or  that  the  results  of 
the  events  are  negatively  correlated  inter  se.  Even  if  the 
actual  standard-deviation  approaches  closely  to  the  standard- 
deviation  of  simple  sampling,  it  is  only  a  conjectural  and  not 
a  necessary  inference  that  all  the  conditions  of  "  simple  sampling  " 
as  defined  in  §  8  of  the  last  chapter  are  fulfilled.  Possibly,  for 
example,  there  may  be  a  positive  correlation  r  between  the 
results  of  the  different  events,  masked  by  a  variation  of  the 
chances  p  and  q  in  sub-classes  of  each  universe. 

Sampling  which  fulfils  the  conditions  laid  down  in  §  8  of 
Chap.  XIII.,  simple  sampling  as  we  have  called  it,  is  generally 
spoken  of  as  random  sampling.  We  have  thought  it  better  to 
avoid  this  term,  as  the  condition  that  the  sampling  shall  be 
random — haphazard — is  not  the  only  condition  tacitly  assumed. 

EXERCISES.

1. Referring to Question 7 of Chap. XIII., work out the values of the
significant standard-deviation $\sigma_p$ (as in § 10) for each row or group of rows
there given, but taking row 5 with rows 6 and 7.

2. For all the districts in England and Wales included in the same table
(Table VI., Chap. IX.) the standard-deviation of the proportion of male births
per 1000 of all births is 7.46 and the mean proportion of male births 509.2.
The harmonic mean number of births in a district is 5070. Find the significant
standard-deviation $\sigma_p$.

3. If for one half of n events the chance of success is p and the chance of
failure q, whilst for the other half the chance of success is q and the chance of
failure p, what is the standard-deviation of the number of successes, the events
being all independent?

4. The following are the deaths from small-pox during the 20 years
1882-1901 in England and Wales:

Year.   Deaths.      Year.   Deaths.
1882     1317        1892      431
  83      957          93     1457
  84     2234          94      820
  85     2827          95      223
  86      275          96      541
  87      506          97       25
  88     1026          98      253
  89       23          99      174
  90       16        1900       85
  91       49        1901      356

The death-rate from small-pox being very small, the rule of § 12, Chap.
XIII., may be applied to estimate the standard-deviation of simple sampling.
Assuming  that  the  excess  of  the  actual  standard-deviation  over  this  can  be 
entirely  accounted  for  by  a  correlation  between  the  results  of  exposure  to  risk 
of  the  individuals  composing  the  population,  estimate  r.  The  mean  population 
during  the  period  may  be  taken  in  round  numbers  as  29  millions. 


CHAPTER  XV. 


THE  BINOMIAL  DISTRIBUTION  AND  THE 
NORMAL  CURVE. 

1-2. Determination of the frequency-distribution for the number of successes
in n events: the binomial distribution. 3. Dependence of the form
of the distribution on p, q and n. 4-5. Graphical and mechanical
methods of forming representations of the binomial distribution.
6. Direct calculation of the mean and the standard-deviation from
the distribution. 7-8. Necessity of deducing, for use in many
practical cases, a continuous curve giving approximately, for large
values of n, the terms of the binomial series. 9. Deduction of the
normal curve as a limit to the symmetrical binomial. 10-11. The
value of the central ordinate. 12. Comparison with a binomial
distribution for a moderate value of n. 13. Outline of the more general
conditions from which the curve can be deduced by advanced methods.
14. Fitting the curve to an actual series of observations. 15. Difficulty
of a complete test of fit by elementary methods. 16. The table of areas
of the normal curve and its use. 17. The quartile deviation and the
"probable error." 18. Illustrations of the application of the normal
curve and of the table of areas.

1.  In  Chapters  XIII.  and  XIV.  the  standard-deviation  of  the 
number  of  successes  in  n  events  was  determined  for  the  several 
more  important  cases,  and  the  applications  of  the  results  indicated. 
For  the  simpler  cases  of  artificial  chance  it  is  possible,  however,  to 
go  much  further,  and  determine  not  merely  the  standard-deviation 
but  the  entire  frequency-distribution  of  the  number  of  "  successes." 
This  we  propose  to  do  for  the  case  of  "simple  sampling,"  in  which 
all  the  events  are  completely  independent,  and  the  chances  p  and 
q  the  same  for  each  event  and  constant  throughout  the  trials. 
The  case  corresponds  to  the  tossing  of  ideally  perfect  coins  (homo- 
geneous circular  discs),  or  the  throwing  of  ideally  perfect  dice 
(homogeneous  cubes). 

2. If we deal with one event only, we expect in $N$ trials $Nq$
failures and $Np$ successes. Suppose we now combine with the
results of this first event the results of a second. The two events
are quite independent, and therefore, according to the rule of
independence, of the $Nq$ failures of the first event $(Nq)q$ will be
associated (on an average) with failures of the second event, and
$(Nq)p$ with successes of the second event (cf. row 2 of the scheme
on p. 292). Similarly of the $Np$ successful first events, $(Np)q$ will
be associated (on an average) with failures of the second event
and $(Np)p$ with successes. In $N$ trials of two events we would
therefore expect approximately $Nq^2$ cases of no success, $2Npq$
cases of one success and one failure, and $Np^2$ cases of two successes,
as in row 3 of the scheme. The results of a third event may be
combined with those of the first two in precisely the same way.
Of the $Nq^2$ cases in which both the first two events failed, $(Nq^2)q$
will be associated (on an average) with failure of the third also,
$(Nq^2)p$ with success of the third. Of the $2Npq$ cases of one
success and one failure, $(2Npq)q$ will be associated with failure
of the third event and $(2Npq)p$ with success, and similarly for
the $Np^2$ cases in which both the first two events succeeded. The
result is that in $N$ trials of three events we should expect $Nq^3$
cases of no success, $3Npq^2$ cases of one success, $3Np^2q$ cases of two
successes, and $Np^3$ cases of three successes, as in row 5 of the
scheme. The scheme is continued for the results of a fourth
event, and it is evident that all the results are included under a
very simple rule: the frequencies of 0, 1, 2, ... successes are
given

for one event    by the binomial expansion of  $N(q+p)$
for two events    "    "    "    "             $N(q+p)^2$
for three events  "    "    "    "             $N(q+p)^3$
for four events   "    "    "    "             $N(q+p)^4$

and so on. Quite generally, in fact: the frequencies of 0, 1, 2, ...
successes in N trials of n events are given by the successive terms
in the binomial expansion of $N(q+p)^n$, viz.

$$N\left\{q^n + n\,q^{n-1}p + \frac{n(n-1)}{1\cdot 2}\,q^{n-2}p^2 + \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3}\,q^{n-3}p^3 + \dots\right\}$$

This  is  the  first  theoretical  expression  that  we  have  obtained  for 
the  form  of  a  frequency-distribution. 
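
The scheme of § 2 can be followed mechanically. The sketch below is an illustration only: the function names are ours, and the values N = 4096 and p = 1/4 are chosen merely so that every frequency is a whole number. It adds one event at a time, splitting each class into a fraction q of failures and a fraction p of successes, and compares the result with the terms of the expansion of N(q + p)^n.

```python
from fractions import Fraction
from math import comb


def combine_one_more_event(freqs, p):
    """Split every class of the previous row into failures (q) and successes (p)."""
    q = 1 - p
    out = [0] * (len(freqs) + 1)
    for successes, f in enumerate(freqs):
        out[successes] += f * q        # the added event fails
        out[successes + 1] += f * p    # the added event succeeds
    return out


def frequencies_by_scheme(N, n, p):
    """Expected frequencies of 0..n successes in N trials of n events."""
    freqs = [N]                        # no events yet: every trial shows 0 successes
    for _ in range(n):
        freqs = combine_one_more_event(freqs, p)
    return freqs


def frequencies_by_expansion(N, n, p):
    """Successive terms of N(q + p)^n."""
    q = 1 - p
    return [N * comb(n, m) * q ** (n - m) * p ** m for m in range(n + 1)]


if __name__ == "__main__":
    p = Fraction(1, 4)                 # illustrative chance of success
    for n in range(1, 5):
        row = frequencies_by_scheme(4096, n, p)
        assert row == frequencies_by_expansion(4096, n, p)
        print(n, [int(f) for f in row])
```

Exact fractions are used so that the two methods can be compared for strict equality rather than to within rounding error.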

3.  The  general  form  of  the  distributions  given  by  such 
binomial  series  will  have  been  evident  from  the  experimental 
examples  given  in  Chapter  XIII.,  i.e.  they  are  distributions 
of  greater  or  less  asymmetry,  tailing  off  in  either  direction 
from  the  mode.  The  distribution  is,  however,  of  so  much 
importance  that  it  is  worth  while  considering  the  form  in 
greater  detail.  This  form  evidently  depends  (1)  on  the  values 
of  q  and  p,  (2)  on  the  value  of  the  exponent  n.  If  p  and  q 
are  equal,  evidently  the  distribution  must  be  symmetrical,  for 
p and q may be interchanged without altering the value of
any  term,  and  consequently  terms  equidistant  from  either 
end  of  the  series  are  equal.  If  p  and  q  are  unequal,  on  the 
other  hand,  the  distribution  is  asymmetrical,  and  the  more 
asymmetrical, for the same value of n, the greater the inequality
of the chances. The following table shows the calculated
distributions for n = 20 and values of p, proceeding by 0.1,
from 0.1 to 0.5.

A. Terms of the Binomial Series $10,000\,(q+p)^{20}$ for Values of p
from 0.1 to 0.5. (Figures given to the nearest unit.)

Number of    p = 0.1   p = 0.2   p = 0.3   p = 0.4   p = 0.5
Successes.   q = 0.9   q = 0.8   q = 0.7   q = 0.6   q = 0.5

    0         1216       115         8        —         —
    1         2702       576        68         5        —
    2         2852      1369       278        31         2
    3         1901      2054       716       123        11
    4          898      2182      1304       350        46
    5          319      1746      1789       746       148
    6           89      1091      1916      1244       370
    7           20       545      1643      1659       739
    8            4       222      1144      1797      1201
    9            1        74       654      1597      1602
   10           —         20       308      1171      1762
   11           —          5       120       710      1602
   12           —          1        39       355      1201
   13           —         —         10       146       739
   14           —         —          2        49       370
   15           —         —         —         13       148
   16           —         —         —          3        46
   17           —         —         —         —         11
   18           —         —         —         —          2
   19           —         —         —         —         —
   20           —         —         —         —         —

When p = 0.1, cases of two successes are the
most frequent, but cases of one success almost equally frequent:
even  nine  successes  may,  however,  occur  about  once  in  10,000 
trials.  As  p  is  increased,  the  position  of  the  maximum 
frequency  gradually  advances,  and  the  two  tails  of  the  distribution 
become more nearly equal, until p = 0.5, when the distribution
is symmetrical. Of course, if the table were continued, the
distribution for p = 0.6 would be similar to that for q = 0.6,
but reversed end for end, and so on. Since the standard-
deviation is $\sqrt{npq}$ and the maximum value of pq is given by
p = q, the symmetrical distribution has the greatest dispersion.
If p = q the effect of increasing n is to raise the mean and
increase the dispersion. If p is not equal to q, however, not
only does an increase in n raise the mean and increase the
dispersion, but it also lessens the asymmetry; the greater n,
for the same values of p and q, the less the asymmetry.
Thus if we compare the first distribution of the above table
with that given by n = 100, we have the following:


B. Terms of the Binomial Series $10,000\,(0.9 + 0.1)^{100}$. (Figures given
to the nearest unit.)

Number of                Number of                Number of
Successes.  Frequency.   Successes.  Frequency.   Successes.  Frequency.

    0           —            8         1148          16          193
    1            3           9         1304          17          106
    2           16          10         1319          18           54
    3           59          11         1199          19           26
    4          159          12          988          20           12
    5          339          13          743          21            5
    6          596          14          513          22            2
    7          889          15          327          23            1

The  maximum  frequencies  now  occur  for  9  and  10  successes, 
and  the  two  "  tails "  are  much  more  nearly  equal.  If,  on  the 
other  hand,  n  is  reduced  to  2,  the  distribution  is — 


Number of Successes.   Frequency.

        0                8100
        1                1800
        2                 100

and the maximum frequency is at one end of the range. Whatever
the values of p and q, if n is only increased sufficiently, the
distribution may be treated as sensibly symmetrical, the necessary
condition being (we state this without proof) that $p - q$ shall be
small compared with the standard-deviation $\sqrt{npq}$. It is left
to the student to calculate as an exercise the theoretical distributions
corresponding to the experimental results cited in Chapter
XIII. (Question 1).
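
The effect of n on the asymmetry may be checked numerically. The following is a minimal sketch, with function names of our own choosing, which computes the terms of 10,000 (q + p)^n for p = 0.1 and n = 2, 20 and 100, and prints beside the position of the greatest term the ratio of p - q (taken without regard to sign) to the standard-deviation. Floating-point arithmetic is assumed to be accurate enough for figures given to the nearest unit.

```python
from math import comb, sqrt


def binomial_terms(N, n, p):
    """Successive terms of N(q + p)^n for 0, 1, ..., n successes."""
    q = 1.0 - p
    return [N * comb(n, m) * q ** (n - m) * p ** m for m in range(n + 1)]


def asymmetry_ratio(n, p):
    """|p - q| divided by the standard-deviation sqrt(npq)."""
    q = 1.0 - p
    return abs(p - q) / sqrt(n * p * q)


if __name__ == "__main__":
    for n in (2, 20, 100):
        terms = binomial_terms(10_000, n, 0.1)
        mode = max(range(n + 1), key=terms.__getitem__)   # most frequent number of successes
        print(f"n={n:3d}  mode={mode:2d}  |p-q|/sd={asymmetry_ratio(n, 0.1):.3f}")
```

The ratio falls steadily as n is increased, which is the numerical counterpart of the statement that the distribution becomes sensibly symmetrical.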

4.  The  property  of  the  binomial  series  used  in  the  scheme  of 
§  2  for  deducing  the  series  with  exponent  n  from  that  with 
exponent n − 1 leads to two interesting methods, graphical and
mechanical, for constructing approximate representations of
binomial distributions. It will have been noted that any one
term, say the rth, in one series is obtained by taking q times the
rth term together with p times the (r − 1)th term of the preceding
series. Now if AP, CR (figure 46) be two verticals, and a third,
BQ, be erected between them, cutting PR in Q, so that
AB : BC :: q : p, then

$$BQ = p \cdot AP + q \cdot CR.$$

(This follows at once on joining AR and considering the two
segments into which BQ is divided.) Consider then some
binomial, say for the case p = 1/4, q = 3/4. Draw a series of verticals
(the  heavy  verticals  of  fig.  47)  at  any  convenient  distance  apart 


on  a  horizontal  base  line,  and  erect  other  verticals  (the  lighter 
verticals)  dividing  the  distance  between  them  in  the  ratio  of 
q  :p,  viz.  3:1.  Next,  choosing  a  vertical  scale,  draw  the  binomial 
polygon for the simplest case n = 1; in the diagram N has been
taken = 4096, and the polygon is abcd, 0b = 3072, 1c = 1024. The
polygons for higher values of n may now be constructed graphically.
Mark the points where ab, bc, cd respectively cut the
intermediate verticals and project them horizontally to the right
on to the thick verticals. This gives the polygon ab′c′d′e for
n = 2. For 0b′ = q.0b, 1c′ = p.0b + q.1c, and so on. Similarly, if the
points where ab′, b′c′, etc., cut the intermediate verticals are
projected horizontally on to the thick verticals, we have the
polygon ab″c″d″e′f for n = 3. The process may be continued
indefinitely, though it will be found difficult to maintain any
high degree of accuracy after the first few constructions.


5.  The  mechanical  method  of  constructing  the  representation  of 
a  binomial  series  is  indicated  diagrammatically  by  fig.  48.  The 
apparatus consists of a funnel opening into a space (say a 1/4 inch in
depth) between a sheet of glass and a back-board. This space is
broken  up  by  successive  rows  of  wedges  like  1,  2  3,  4  5  6,  etc.,  which 
will  divide  up  into  streams  any  granular  material  such  as  shot  or 
mustard  seed  which  is  poured  through  the  funnel  when  the 
apparatus  is  held  at  a  slope.  At  the  foot  these  wedges  are 
replaced  by  vertical  strips,  in  the  spaces  between  which  the 


material  can  collect.  Consider  the  stream  of  material  that 
comes  from  the  funnel  and  meets  the  wedge  1.  This  wedge  is 
set  so  as  to  throw  q  parts  of  the  stream  to  the  left  and  p  parts 
to  the  right  (of  the  observer).  The  wedges  2  and  3  are  set  so  as 
to  divide  the  resultant  streams  in  the  same  proportions.  Thus 
wedge 2 throws $q^2$ parts of the original material to the left and
$qp$ to the right, wedge 3 throws $pq$ parts of the original material
to the left and $p^2$ to the right. The streams passing these wedges
are therefore in the ratio of $q^2 : 2qp : p^2$. The next row of wedges
is again set so as to divide these streams in the same proportions
as before, and the four streams that result will bear the proportions
$q^3 : 3q^2p : 3qp^2 : p^3$. The final set, at the heads of the
vertical strips, will give the streams proportions $q^4 : 4q^3p : 6q^2p^2 :
4qp^3 : p^4$, and these streams will accumulate between the strips
and  give  a  representation  of  the  binomial  by  a  kind  of  histogram, 
as  shown.  Of  course  as  many  rows  of  wedges  may  be  provided 
as  may  be  desired. 

This  kind  of  apparatus  was  originally  devised  by  Sir  Francis 
Galton  (ref.  1)  in  a  form  that  gives  roughly  the  symmetrical 
binomial,  a  stream  of  shot  being  allowed  to  fall  through  rows  of 
nails,  and  the  resultant  streams  being  collected  in  partitioned 
spaces.  The  apparatus  was  generalised  by  Professor  Pearson, 
who  used  rows  of  wedges  fixed  to  movable  slides,  so  that  they 
could  be  adjusted  to  give  any  ratio  of  q  :p.    (Ref.  13.) 
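
The working of such an apparatus may be imitated by a simulation. The sketch below rests on an assumption that is ours and not part of the description above: each grain is supposed to be thrown to the right independently, with chance p, at every row of wedges it meets. The function names, the number of grains and the fixed seed are likewise illustrative only.

```python
import random
from math import comb


def run_wedge_board(n_rows, p, n_grains, seed=0):
    """Count the grains collected in each of the n_rows + 1 compartments."""
    rng = random.Random(seed)              # fixed seed so that a run can be repeated
    bins = [0] * (n_rows + 1)
    for _ in range(n_grains):
        # Number of rows at which this grain is thrown to the right.
        rights = sum(rng.random() < p for _ in range(n_rows))
        bins[rights] += 1
    return bins


def expected_bins(n_rows, p, n_grains):
    """Terms of n_grains * (q + p)^n_rows, the theoretical contents."""
    q = 1 - p
    return [n_grains * comb(n_rows, m) * q ** (n_rows - m) * p ** m
            for m in range(n_rows + 1)]


if __name__ == "__main__":
    observed = run_wedge_board(4, 0.25, 20_000)
    expected = expected_bins(4, 0.25, 20_000)
    for m, (o, e) in enumerate(zip(observed, expected)):
        print(m, o, round(e))
```

The observed contents differ from the theoretical terms only by fluctuations of sampling, which diminish in proportion as the number of grains is increased.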

6.  The  values  of  the  mean  and  standard-deviation  of  a  binomial 
distribution  may  be  found  from  the  terms  of  the  series  directly, 
as  well  as  by  the  method  of  Chap.  XIII.  (the  calculation  was 
in fact given as an exercise in Question 8, Chap. VII., and
Question 6, Chap. VIII.). Arrange the terms under each other
as in col. 1 below, and treat the problem as if it were an arithmetical
example, taking the arbitrary origin at 0 successes: as
N is a factor all through, it may be omitted for convenience.

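$$
\begin{array}{cccc}
(1) & (2) & (3) & (4) \\
\text{Frequency } f. & \text{Dev. } \xi. & f\xi. & f\xi^2. \\[4pt]
q^n & 0 & 0 & 0 \\[4pt]
n\,q^{n-1}p & 1 & n\,q^{n-1}p & n\,q^{n-1}p \\[4pt]
\dfrac{n(n-1)}{1\cdot 2}\,q^{n-2}p^2 & 2 & n(n-1)\,q^{n-2}p^2 & 2n(n-1)\,q^{n-2}p^2 \\[8pt]
\dfrac{n(n-1)(n-2)}{1\cdot 2\cdot 3}\,q^{n-3}p^3 & 3 & \dfrac{n(n-1)(n-2)}{1\cdot 2}\,q^{n-3}p^3 & \dfrac{3n(n-1)(n-2)}{1\cdot 2}\,q^{n-3}p^3 \\[4pt]
\vdots & \vdots & \vdots & \vdots
\end{array}
$$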

The  sum  of  col.  1  is  of  course  unity,  i.e.  we  are  treating  N  as 
unity,  and  the  mean  is  therefore  given  by  the  sum  of  the  terms 
in  col.  (3).    But  this  sum  is 

$$np\left\{q^{n-1} + (n-1)\,q^{n-2}p + \frac{(n-1)(n-2)}{1\cdot 2}\,q^{n-3}p^2 + \dots\right\} = np\,(q+p)^{n-1} = np.$$


That  is,  the  mean  M  is  np,  as  by  the  method  of  Chap.  XIII. 
The square of the standard-deviation is given by the sum of
the terms in col. (4) less the square of the mean, that is,

$$\sigma^2 = np\left\{q^{n-1} + 2(n-1)\,q^{n-2}p + 3\,\frac{(n-1)(n-2)}{1\cdot 2}\,q^{n-3}p^2 + \dots\right\} - n^2p^2.$$

But the series in the bracket is the binomial series $(q+p)^{n-1}$
with the successive terms multiplied by 1, 2, 3, ... It therefore
gives the difference of the mean of the said binomial from −1,
and its sum is therefore $(n-1)p + 1$. Therefore

$$\sigma^2 = np\{(n-1)p + 1\} - n^2p^2 = np - np^2 = npq.$$
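
The same arithmetic may be carried out for any particular binomial. The sketch below (the function name and the choice n = 20, p = 1/10 are illustrative assumptions) forms the columns f, fξ and fξ² from the terms of the series, taking the origin at 0 successes and N as unity, and uses exact fractions so that the results can be compared directly with np and npq.

```python
from fractions import Fraction
from math import comb


def moments_from_terms(n, p):
    """Return (sum of f, mean, variance) computed directly from the binomial terms."""
    q = 1 - p
    f = [comb(n, m) * q ** (n - m) * p ** m for m in range(n + 1)]   # col. (1)
    total = sum(f)
    first = sum(m * fm for m, fm in enumerate(f))                    # sum of col. (3)
    second = sum(m * m * fm for m, fm in enumerate(f))               # sum of col. (4)
    mean = first / total
    variance = second / total - mean ** 2
    return total, mean, variance


if __name__ == "__main__":
    n, p = 20, Fraction(1, 10)
    total, mean, variance = moments_from_terms(n, p)
    print(total, mean, variance)
    print(n * p, n * p * (1 - p))
```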

7.  The  terms  of  the  binomial  series  thus  afford  a  means  of 
completely  describing  a  certain  class  of  frequency-distributions — 
i.e.  of  giving  not  merely  the  mean  and  standard-deviation  in 
each  case,  but  of  describing  the  whole  form  of  the  distribution. 
If N samples of n cards each be drawn from an indefinitely large
record of cards marked with A or a, the proportion of A-cards
in the record being p, then the successive terms of the series
$N(q+p)^n$ give the frequencies to be expected in the long run of
0, 1, 2, ... A-cards in the sample, the actual frequencies only
deviating from these by errors which are themselves fluctuations
of sampling. The three constants N, p, n, therefore, determine
the  average  or  smoothed  form  of  the  distribution  to  which  actual 
distributions  will  more  or  less  closely  approximate. 

Considered,  however,  as  a  formula  which  may  be  generally 
useful  for  describing  frequency-distributions,  the  binomial  series 
suffers  from  a  serious  limitation,  viz.  that  it  only  applies  to  a 
strictly  discontinuous  distribution  like  that  of  the  number  of 
A-cards drawn from a record containing A's and a's, or the number
of  heads  thrown  in  tossing  a  coin.  The  question  arises  whether 
we  can  pass  from  this  discontinuous  formula  to  an  equation 
suitable  for  representing  a  continuous  distribution  of  frequency. 

8.  Such  an  equation  becomes,  indeed,  almost  a  necessity  for 
certain  cases  with  which  we  have  already  dealt.  Consider,  for 
example,  the  frequency-distribution  of  the  number  of  male  births 
in  batches  of  10,000  births,  the  mean  number  being,  say,  5100. 
The distribution will be given by the terms of the series
$(0.49 + 0.51)^{10,000}$, and the standard-deviation is, in round numbers,
50  births.  The  distribution  will  therefore  extend  to  some  150 
births  or  more  on  either  side  of  the  mean  number,  and  in  order 
to  obtain  it  we  should  have  to  calculate  some  300  terms  of  a 
binomial  series  with  an  exponent  of  10,000!  This  would  not 
only  be  practically  impossible  without  the  use  of  certain  methods 
of  approximation,  but  it  would  give  the  distribution  in  quite 
unnecessary detail: as a matter of practice, we would not have
compiled a frequency-distribution by single male births, but
would certainly have grouped our observations, taking probably
10 births as the class-interval. We want, therefore, to replace the
binomial series by some continuous curve, having approximately
the same ordinates, the curve being such that the area between
any two ordinates $y_1$ and $y_2$ will give the frequency of observations
between the corresponding values of the variable $x_1$ and $x_2$.

9.  It  is  possible  to  find  such  a  continuous  limit  to  the  binomial 
series for any values of p and q, but in the present work we will
confine ourselves to the simplest case in which p = q = 0.5, and the
binomial is symmetrical. The terms of the series are

$$N\left(\tfrac{1}{2}\right)^n\left\{1 + n + \frac{n(n-1)}{1\cdot 2} + \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3} + \dots\right\}.$$

The frequency of m successes is

$$N\left(\tfrac{1}{2}\right)^n \frac{n!}{m!\,(n-m)!},$$

and the frequency of m + 1 successes is derived from this by
multiplying it by $(n-m)/(m+1)$. The latter frequency is
therefore greater than the former so long as

$$n - m > m + 1 \quad\text{or}\quad m < \frac{n-1}{2}.$$

Suppose, for simplicity, that n is even, say equal to 2k; then the
frequency of k successes is the greatest, and its value is

$$y_0 = N\left(\tfrac{1}{2}\right)^{2k}\frac{(2k)!}{k!\,k!}. \qquad (1)$$

The  polygon  tails  off  symmetrically  on  either  side  of  this  greatest 
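ordinate. Consider the frequency of k + x successes; the value is

$$y_x = N\left(\tfrac{1}{2}\right)^{2k}\frac{(2k)!}{(k+x)!\,(k-x)!}, \qquad (2)$$

and therefore

$$\frac{y_x}{y_0} = \frac{k\,(k-1)(k-2)\cdots(k-x+1)}{(k+1)(k+2)(k+3)\cdots(k+x)} = \frac{\left(1-\frac{1}{k}\right)\left(1-\frac{2}{k}\right)\cdots\left(1-\frac{x-1}{k}\right)}{\left(1+\frac{1}{k}\right)\left(1+\frac{2}{k}\right)\cdots\left(1+\frac{x}{k}\right)}. \qquad (3)$$
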
Now let us approximate by assuming, as suggested in § 8, that
k is very large, and indeed large compared with x, so that $(x/k)^2$
may be neglected compared with $(x/k)$. This assumption does
not involve any difficulty, for we need not consider values of x
much greater than three times the standard-deviation or $3\sqrt{k/2}$,
and the ratio of this to k is $3/\sqrt{2k}$, which is necessarily small if k
be large. On this assumption we may apply the logarithmic
series

$$\log_e(1+\delta) = \delta - \frac{\delta^2}{2} + \frac{\delta^3}{3} - \frac{\delta^4}{4} + \dots$$

to every bracket in the fraction (3), and neglect all terms beyond
the first. To this degree of approximation,

$$\log_e\frac{y_x}{y_0} = -\frac{2}{k}\bigl(1 + 2 + 3 + \dots + (x-1)\bigr) - \frac{x}{k} = -\frac{x(x-1)}{k} - \frac{x}{k} = -\frac{x^2}{k}.$$

Therefore, finally,

$$y_x = y_0\,e^{-x^2/k} = y_0\,e^{-x^2/2\sigma^2}, \qquad (4)$$

where, in the last expression, the constant k has been replaced by
the standard-deviation $\sigma$, for $\sigma^2 = k/2$.

The  curve  represented  by  this  equation  is  symmetrical  about 
the point x = 0, which gives the greatest ordinate $y = y_0$. Mean,
median,  and  mode  therefore  coincide,  and  the  curve  is,  in  fact,  that 
drawn  in  fig.  5,  p.  89,  and  taken  as  the  ideal  form  of  the  symmetri- 
cal frequency-distribution  in  Chap.  VI.  The  curve  is  generally 
known  as  the  normal  curve  of  errors  or  of  frequency,  or  the  law 
of  error. 
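
The closeness of the limiting form may be examined numerically. In the sketch below, which is an illustration with function names and values of k chosen by us, the exact ratio of the ordinate at k + x successes to the central ordinate of the symmetrical binomial with exponent 2k is set beside the exponential to which it has been shown to tend, the deviation x being taken in each case equal to the standard-deviation.

```python
from math import comb, exp


def exact_ratio(k, x):
    """y_x / y_0 for the symmetrical binomial with exponent n = 2k."""
    return comb(2 * k, k + x) / comb(2 * k, k)


def limiting_ratio(k, x):
    """exp(-x^2 / k), i.e. exp(-x^2 / (2 sigma^2)) with sigma^2 = k / 2."""
    return exp(-x * x / k)


if __name__ == "__main__":
    # (k, x) pairs with x equal to the standard-deviation sqrt(k / 2).
    for k, x in ((8, 2), (32, 4), (5000, 50)):
        print(k, x, exact_ratio(k, x), limiting_ratio(k, x))
```

The two columns approach one another as k is increased, the limiting value being the same in each row because x has been kept equal to the standard-deviation.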

10.  A  normal  curve  is  evidently  defined  completely  by  giving 
the values of $y_0$ and $\sigma$ and assigning the origin of x. If we
desire to make a normal curve fit some given distribution as near
as may be, the last two data are given by the standard-deviation
and the mean respectively; the value of $y_0$ will be given by the
fact that the areas of the two distributions, or the numbers of
observations which these areas represent, must be the same.

This condition does not, however, lead in any simple and
elementary algebraic way to an expression for $y_0$, though such
a value could be found arithmetically to any desired degree
of approximation. For it is evident that (1) any alteration in
$y_0$ produces a proportionate alteration in the area of the curve,
e.g. doubling $y_0$ doubles every ordinate $y_x$ and therefore doubles
the area: (2) any alteration in $\sigma$ produces a proportionate
alteration in the area, for the values of $y_x$ are the same for the
same values of $x/\sigma$, and therefore doubling $\sigma$ doubles the distance
of every ordinate from the mean, and consequently doubles the
area. The area of the curve, or the number of observations
represented, is therefore proportional to $y_0\sigma$, or we must have

$$N = a \times y_0\,\sigma,$$

where a is a numerical constant. The value of a may be found
approximately by taking $y_0$ and $\sigma$ both equal to unity, calculating
the values of the ordinates $y_x$ for equidistant values of x, and
taking the area, or number of observations N, as given by the
sum of the ordinates multiplied by the interval.

11. The table below gives the values of y for values of x
proceeding by fifths of a unit; the values are, of course, the same
for positive and negative values of x. For the whole curve the
sum of the ordinates will be found to be 12.53318, the interval
being 0.2 units; the area is therefore, approximately, 2.50664,
and this is the approximate value of a.

Ordinates of the Curve $y = e^{-x^2/2}$. (For references to more extended
tables, see list on pp. 357-8.)

  x.       y.        Log y.         x.       y.        Log y.

  0      1.00000     0             2.6     .03405     2̄.53209
  0.2     .98020     1̄.99131       2.8     .01984     2̄.29757
  0.4     .92312     1̄.96526       3.0     .01111     2̄.04567
  0.6     .83527     1̄.92183       3.2     .00598     3̄.77641
  0.8     .72615     1̄.86103       3.4     .00309     3̄.48978
  1.0     .60653     1̄.78285       3.6     .00153     3̄.18577
  1.2     .48675     1̄.68731       3.8     .00073     4̄.86439
  1.4     .37531     1̄.57439       4.0     .00034     4̄.52564
  1.6     .27804     1̄.44410       4.2     .00015     4̄.16952
  1.8     .19790     1̄.29644       4.4     .00006     5̄.79603
  2.0     .13534     1̄.13141       4.6     .00003     5̄.40516
  2.2     .08892     2̄.94901       4.8     .00001     6̄.99693
  2.4     .05614     2̄.74923       5.0     .00000     6̄.57132

The value is more than
sufficiently accurate for practical purposes, for the exact value
is $\sqrt{2\pi} = 2.506627\ldots$ The proof of this value cannot be given
here, but it may be deduced from an important approximate
expression for the factorials of large numbers, due to James
Stirling (1730). If n be large, we have, to a high degree of
approximation,

$$n! = \sqrt{2\pi n}\;\frac{n^n}{e^n}.$$

Applying Stirling's theorem to the factorials in equation (1) we
have

$$y_0 = \frac{N}{\sqrt{\pi k}} = \frac{N}{\sqrt{2\pi}\,\sigma}. \qquad (5)$$

The complete expression for the normal curve is therefore

$$y = \frac{N}{\sqrt{2\pi}\,\sigma}\,e^{-x^2/2\sigma^2}. \qquad (6)$$

The exponent may be written $x^2/c^2$ where $c = \sqrt{2}\,\sigma$, and this is
the origin of the use of $\sqrt{2}\,\sigma$ (the "modulus") as a measure
of dispersion, of $1/(\sqrt{2}\,\sigma)$ as a measure of "precision," and of $2\sigma^2$
as "the fluctuation" (cf. Chap. VIII. § 13). The use of the factor
2 or $\sqrt{2}$ becomes meaningless if the distribution be not normal.
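
The arithmetic of this section is easily repeated. The sketch below is an illustration only, with function names of our own: it sums the ordinates of the curve with central ordinate and standard-deviation both unity at intervals of 0.2 from −5 to +5, compares the resulting area with the square root of 2π, and then tries Stirling's expression both on a single factorial and on the central ordinate of the symmetrical binomial with N = 10,000 and k = 32.

```python
from math import comb, e, exp, factorial, pi, sqrt


def area_by_ordinates(step=0.2, x_max=5.0):
    """Sum of the ordinates of y = exp(-x^2 / 2), and that sum times the interval."""
    n_steps = round(x_max / step)
    total = sum(exp(-((i * step) ** 2) / 2) for i in range(-n_steps, n_steps + 1))
    return total, total * step


def stirling(n):
    """Stirling's approximation sqrt(2 pi n) * n^n / e^n."""
    return sqrt(2 * pi * n) * (n / e) ** n


def central_ordinate_exact(N, k):
    """Greatest term of N (1/2 + 1/2)^(2k)."""
    return N * comb(2 * k, k) / 4 ** k


def central_ordinate_stirling(N, k):
    """N / sqrt(pi k), the value to which Stirling's theorem leads."""
    return N / sqrt(pi * k)


if __name__ == "__main__":
    total, area = area_by_ordinates()
    print(total, area, sqrt(2 * pi))
    print(stirling(10) / factorial(10))
    print(central_ordinate_exact(10_000, 32), central_ordinate_stirling(10_000, 32))
```

The sum here is formed from ordinates carried to full machine precision, so it may differ in the last figures from a sum of the five-figure entries of a printed table.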

Another  rule  cited  in  Chap.  VIII.,  viz.  that  the  mean  deviation 
is  approximately  4/5  of  the  standard-deviation,  is  strictly  true 
for the normal curve only. For this distribution the mean
deviation $= \sigma\sqrt{2/\pi} = 0.79788\ldots\sigma$: the proof cannot be given
within the limitations of the present work. The rule that a
range  of  6  times  the  standard-deviation  includes  the  great 
majority  of  the  observations  and  that  the  quartile  deviation  is 
about  2/3  of  the  standard-deviation  were  also  suggested  by  the 
properties  of  this  curve  (see  below  §§  16,  17). 

12.  In  the  proof  of  §  9  the  assumption  was  made  that  k  (the 
half  of  the  exponent  of  the  binomial)  was  very  large  compared 
with  X  (any  deviation  that  had  to  be  considered).  In  point 
of  fact,  however,  the  normal  curve  gives  the  terms  of  the 
symmetrical  binomial  surprisingly  closely  even  for  moderate 
values of n. Thus if n = 64, k = 32, and the standard-deviation
is 4. Deviations x have therefore to be considered up to ±12
or more, which is over 1/3 of k. As will be seen, however, from
the annexed table, the ordinates of the normal curve agree with
those of the binomial to the nearest unit (in 10,000 observations)
up to x = ±15. The closeness of approximation is partly due
to  the  fact  that,  in  applying  the  logarithmic  series  to  the 
fraction  on  the  right  of  equation  (3),  the  terms  of  the  second 
order  in  expansions  of  corresponding  brackets  in  numerator  and 
denominator  cancel  each  other :  these  terms,  therefore,  do  not 
accumulate, but only the terms of the third order. There is
only one second-order term that has been neglected, viz. that due
to the last bracket in the denominator. Even for much lower
values of n than that chosen for the illustration, e.g. 10 or 12
(cf. Qu. 4 at the end of this chapter), the normal curve still
gives  a  very  fair  approximation. 

Table showing (1) Ordinates of the Binomial Series 10,000 (1/2 + 1/2)^64 and
(2) Corresponding Ordinates of the Normal Curve
$y = \frac{10{,}000}{4\sqrt{2\pi}}\, e^{-x^2/32}$.

  Term.        Binomial   Normal   |   Term.        Binomial   Normal
               Series.    Curve.   |                Series.    Curve.

  32             993        997    |   24 and 40      136        135
  31 and 33      963        967    |   23 and 41       80         79
  30 and 34      878        880    |   22 and 42       44         44
  29 and 35      753        753    |   21 and 43       23         23
  28 and 36      606        605    |   20 and 44       11         11
  27 and 37      459        457    |   19 and 45        5          5
  26 and 38      326        324    |   18 and 46        2          2
  25 and 39      217        216    |   17 and 47        1          1
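The figures of the table can be recomputed directly. The following is a minimal sketch in Python, added for illustration and not part of the original text (the function names are hypothetical): it evaluates the binomial ordinates 10,000 C(64, 32 + x)/2^64 and the normal ordinates 10,000/(4√(2π)) e^(-x²/32) for deviations x = 0 to 15.

```python
import math

N = 10_000                          # number of observations
n = 64                              # exponent of the symmetrical binomial
k = n // 2                          # mean number of successes, 32
sigma = math.sqrt(n * 0.5 * 0.5)    # standard-deviation, 4

def binomial_ordinate(x):
    """Frequency of a deviation x from the mean in N (1/2 + 1/2)^n."""
    return N * math.comb(n, k + x) / 2 ** n

def normal_ordinate(x):
    """Ordinate of the normal curve with the same N and standard-deviation."""
    y0 = N / (sigma * math.sqrt(2 * math.pi))
    return y0 * math.exp(-x * x / (2 * sigma ** 2))

# One row per pair of terms (k - x and k + x), as in the table.
for x in range(0, 16):
    print(k - x, k + x, round(binomial_ordinate(x)), round(normal_ordinate(x)))
```

The exact integer arithmetic of `math.comb` avoids any rounding in the binomial coefficients; only the final division is done in floating point.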

13.  But  if  the  normal  curve  were  limited  in  its  application  to 
distributions  which  were  certainly  of  binomial  type,  its  use  in 
practice  (apart  from  its  theoretical  applications  to  many  cases  of 
the  theory  of  sampling)  would  be  very  restricted.  As  suggested, 
however,  by  the  illustrations  given  in  Chap.  VI.,  a  certain,  though 
not  a  large,  number  of  distributions — more  particularly  among 
those  relating  to  measurements  on  man  and  other  animals — are 
approximately  of  normal  form,  even  although  such  distributions 
have  not  obviously  originated  in  the  same  way  as  a  binomial 
distribution. Take, for example, the distribution of statures in
the United Kingdom (Chap. VI., Table VI.). The mean stature
is 67.46 inches, the standard-deviation 2.57 inches (the values are
worked out in the illustrations of Chaps. VII. and VIII.), and the
number of observations 8585. This gives $y_0 = N/(\sigma\sqrt{2\pi}) = 1333$,
and all the data necessary for plotting a normal curve of the same mean and
standard-deviation (the process of fitting is dealt with at greater
length in § 14 below). The two distributions are shown together
in fig. 49, the continuous curve being the normal curve, and the
small circles showing the observed frequencies. It is evident that
they agree very closely. Other body measurements, e.g. skull
measurements, etc., also follow the normal law; it also applies to
certain characters in plants (e.g. number of seeds per capsule in
Nelumbium, Pearl, American Naturalist, Nov. 1906). The question
arises,  therefore,  why,  in  such  cases,  the  distribution  should  be 
approximately  normal,  a  form  of  distribution  which  we  have  only 
shown  to  arise  if  the  variable  is  the  sum  of  a  large  number  of 
elements,  each  of  which  can  take  the  values  0  and  1  (or  other  two 
constant  values),  these  values  occurring  independently,  and  with 
equal  frequency. 

Fig. 49. The Distribution of Stature for Adult Males in the British Isles
(fig. 6, p. 89), fitted with a Normal Curve: to avoid confusing the
figure, the frequency-polygon has not been drawn in, the tops of the
ordinates being shown by small circles.

In the first place, it should be stated that the conditions of the
deduction given in § 9 were made somewhat more restricted than was
necessary, with a view to securing simplicity of algebra. The deduction
may be generalised, whilst retaining the same type of proof, by
assuming that p and q are unequal (provided p - q be small
compared with √(npq), cf. § 3), that p and q are not quite the
same for all the events, that all the events are not quite
independent, or that n is not large, but that some sort of continuous
variation is possible in the values of the elementary variables,
these being no longer restricted to 0 and 1, or two other discrete
values. (Cf. the deduction given by Pearson in ref. 13.)
Proceeding further from this last idea, the deduction may be rendered
more general still, without introducing the conception of the
binomial at all, by founding the curve on more or less complex
cases of the theory of sampling for variables instead of for
attributes. If a variable is the sum (or, within limits, some slightly
more complicated function) of a large number of other variables,
then the distribution of the compound or resultant variable is
normal, provided that the elementary variables are independent,
or nearly so (cf. ref. 6). The forms of the frequency-distributions
of the elementary variables affect the final distribution less
and less as their number is increased: only if their number is
moderate, and the distributions all exhibit a comparatively high
degree of asymmetry of uniform sign, will the same sign of
asymmetry be sensibly evident in the distribution of the compound
variable. On this sort of hypothesis, the expectation of normality
in the case of stature may be based on the fact that it is a highly
compound character (depending on the sizes of the bones of the
head, the vertebral column, and the legs, the thickness of the
intervening cartilage, and the curvature of the spine), the elements
of which it is composed being at least to some extent independent,
i.e. by no means perfectly correlated with each other, and their
frequency-distributions exhibiting no very high degree of
asymmetry of one and the same sign. The comparative rarity of
normal distributions in economic statistics is probably due in part
to the fact that in most cases, while the entire causation is
certainly complex, relatively few causes have a largely predominant
influence (hence also the frequent occurrence of irregular
distributions in this field of work), and in part also to a high
degree of asymmetry in the distributions of the elements on which
the compound variable depends. Errors of observation may in
general be regarded as compounded of a number of elements, due
to various causes, and it was in this connection that the normal
curve was first deduced, and received its name of the curve of
errors, or law of error.
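The statement that asymmetry in the elements affects the resultant less and less as their number increases can be illustrated numerically. The sketch below is an illustration added here, not the author's method: it assumes the simplest asymmetrical element, an event of chance p = 0.1 taking the values 0 and 1, and uses the moment measure of skewness μ₃/σ³ (an assumed choice of measure, with a hypothetical function name). The skewness of the sum of n such independent elements is computed from the exact binomial frequencies and compared with (q - p)/√(npq).

```python
import math

def binomial_skewness(n, p):
    """Skewness mu3 / sigma^3 of the sum of n independent 0/1 elements,
    computed from the exact binomial frequencies."""
    q = 1.0 - p
    probs = [math.comb(n, r) * p ** r * q ** (n - r) for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(probs))
    mu2 = sum((r - mean) ** 2 * pr for r, pr in enumerate(probs))
    mu3 = sum((r - mean) ** 3 * pr for r, pr in enumerate(probs))
    return mu3 / mu2 ** 1.5

# The asymmetry of the compound variable falls off as 1 / sqrt(n).
for n in (10, 100, 1000):
    exact = binomial_skewness(n, 0.1)
    formula = (0.9 - 0.1) / math.sqrt(n * 0.1 * 0.9)
    print(n, round(exact, 4), round(formula, 4))
```

With only ten elements the asymmetry is still plainly present; with a thousand it has shrunk to a tenth of that value, which is the sense in which a highly compound character may be expected to be nearly normal.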

14. If it be desired to compare some actual distribution
with the normal distribution, the two distributions should be
superposed on one diagram, as in fig. 49, though, of course, on
a much larger scale. When the mean and standard-deviation
of the actual distribution have been determined, $y_0$ is given by
equation (5); the fit will probably be slightly closer if the
standard-deviation is adjusted by Sheppard's correction (Chap.
XI. § 4). The normal curve is then most readily drawn by
plotting a scale showing fifths of the standard-deviation along the
base line of the frequency diagram, taking the mean as origin,
and marking over these points the ordinates given by the figures
of the table on p. 303, multiplied in each case by $y_0$. The curve
can be drawn freehand, or by aid of a curve ruler, through the
tops of the ordinates so determined. The logarithms of y in the
table on p. 303 are given to facilitate the multiplication. The only
point in which the student is likely to find any difficulty is
in the use of the scales: he must be careful to remember
that the standard-deviation must be expressed in terms of the
class-interval as a unit in order to obtain for $y_0$ a number of
observations per interval comparable with the frequencies of his
table.

The process may be varied by keeping the normal curve
drawn to one scale, and redrawing the actual distribution
so as to make the area, mean, and standard-deviation the
same. Thus suppose a diagram of a normal curve was printed
once for all to a scale, say, of $y_0$ = 5 inches, σ = 1 inch, and
it were required to fit the distribution of stature to it.
Since the standard-deviation is 2.57 inches of stature, the
scale of stature is 1 inch = 2.57 inches of stature, or 0.389 inches
= 1 inch of stature; this scale must be drawn on the base of the
normal-curve diagram, being so placed that the mean falls
at 67.46. As regards the scale of frequency-per-interval, this
is given by the fact that the whole area of the polygon showing
the actual distribution must be equal to the area of the
normal curve, that is $5\sqrt{2\pi} = 12.53$ square inches. If, therefore,
the scale required is n observations per interval to the inch,
we have, the number of observations being 8585,

$$\frac{8585}{n \times 2.57} = 12.53,$$

which gives n = 266.6.

Though the second method saves curve drawing, the first,
on the whole, involves the least arithmetic and the simplest
plotting.
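The arithmetic of both methods of fitting can be sketched as follows, for the stature distribution of fig. 49. This is an illustrative Python sketch added here, not a procedure given in the text; the variable and function names are hypothetical, and the ordinates are computed from the exponential directly instead of being read from the table on p. 303.

```python
import math

N = 8585          # number of observations
mean = 67.46      # inches
sigma = 2.57      # inches; the class-interval is 1 inch

# First method: central ordinate y0, then ordinates at fifths of sigma.
y0 = N / (sigma * math.sqrt(2 * math.pi))

def fitted_ordinate(t):
    """Ordinate (observations per inch of stature) at a deviation of t * sigma."""
    return y0 * math.exp(-t * t / 2)

# (stature, ordinate) pairs from -3 sigma to +3 sigma in steps of sigma / 5.
ordinates = [(mean + 0.2 * i * sigma, fitted_ordinate(0.2 * i))
             for i in range(-15, 16)]

# Second method: a printed curve with y0 = 5 in. and sigma = 1 in.
area_printed = 5 * math.sqrt(2 * math.pi)          # square inches
inches_per_inch_of_stature = 1 / sigma             # horizontal scale
obs_per_interval_per_inch = N / (area_printed * sigma)

print(round(y0), round(area_printed, 2),
      round(inches_per_inch_of_stature, 3),
      round(obs_per_interval_per_inch, 1))
```

Carried to full precision the last figure comes to about 266.5; the value 266.6 in the text results from using the area rounded to 12.53.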

15.  Any  plotting  of  a  diagram,  or  the  equivalent  arithmetical 
comparison  of  actual  frequencies  with  those  given  by  the 
fitted  normal  distribution,  affords,  of  course,  in  itself,  only  a 
rough  test,  of  a  practical  kind,  of  the  normality  of  the  given 
distribution.  The  question  whether  all  the  observed  differences 
between  actual  and  calculated  frequencies,  taken  together, 
may  have  arisen  merely  as  fluctuations  of  sampling,  so  that  the 
actual  distribution  may  be  regarded  as  strictly  normal,  neglecting 
such  errors,  is  a  question  of  a  kind  that  cannot  be  answered  in 
an elementary work (cf. ref. 22). At present the student is in
a position to compare the divergences of actual from calculated
frequencies with fluctuations of sampling in the case of single
class-intervals, or single groups of class-intervals only. If the
expected theoretical frequency in a certain interval is f, the
standard error of sampling is $\sqrt{f(N - f)/N}$; and if the divergence
of the observed from the theoretical frequency exceed some
three times this standard error, the divergence is unlikely to
have occurred as a mere fluctuation of sampling.
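The rule just stated is easily put into a form that can be applied to any single class. The following is a minimal sketch, assuming Python and hypothetical function names; the figures used to try it are those of Example iv of § 18 (expected frequency 201.5, observed 169, N = 8585).

```python
import math

def class_frequency_standard_error(f, N):
    """Standard error sqrt(f (N - f) / N) of a class frequency whose
    expected value is f in N observations."""
    return math.sqrt(f * (N - f) / N)

def exceeds_three_standard_errors(observed, f, N):
    """True if the divergence is more than three times the standard error."""
    return abs(observed - f) > 3 * class_frequency_standard_error(f, N)

se = class_frequency_standard_error(201.5, 8585)
print(round(se, 1), exceeds_three_standard_errors(169, 201.5, 8585))
```

The threshold of three standard errors is the rough working rule of the text, not a sharp boundary; the function only reports on which side of it a divergence falls.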

It  should  be  noted,  however,  that  the  ordinate  of  the  normal 
curve  at  the  middle  of  an  interval  does  not  give  accurately  the 
area  of  that  interval,  or  the  number  of  observations  within  it :  it 
would  only  do  so  if  the  curve  were  sensibly  straight.  To  deal 
strictly  with  problems  as  to  fluctuations  of  sampling  in  the 
frequencies  of  single  intervals  or  groups  of  intervals,  we  require, 
accordingly, some convenient means of obtaining the number of
observations,  in  a  given  normal  distribution,  lying  between  any 
two  values  of  the  variable. 

16. If an ordinate be erected at a distance x/σ from the mean,
in a normal curve, it divides the whole area into two parts, the
ratio of which is evidently, from the mode of construction of the
curve, independent of the values of $y_0$ and of σ. The calculation
of these fractions of area for given values of x/σ, though a long
and tedious matter, can thus be done once for all, and a table
giving the results is useful for the purpose suggested in § 15 and
in many other ways. References to complete tables are cited at
the end of this work (list of tables, pp. 357-8), the short table below
being given only for illustrative purposes. The table shows the
greater fraction of the area lying on one side of any given ordinate;
e.g. 0.53983 of the whole area lies on one side of an ordinate at
0.1σ from the mean, and 0.46017 on the other side. It will be
seen that an ordinate drawn at a distance from the mean equal to
the standard-deviation cuts off some 16 per cent. of the whole
area on one side; some 68 per cent. of the area will therefore be
contained between ordinates at ±σ. An ordinate at twice the
standard-deviation cuts off only 2.3 per cent., and therefore some
95.4 per cent. of the whole area lies within a range of ±2σ. At
three times the standard-deviation the fraction of area cut off is
reduced to 135 parts in 100,000, leaving 99.7 per cent. within a
range of ±3σ. This is the basis of our rough rule that a range
of 6 times the standard-deviation will in general include the
great bulk of the observations: the rule is founded on, and is only
strictly true for, the normal distribution. For other forms of
distribution it need not hold good, though experience suggests
that it more often holds than not. The binomial distribution,
especially if p and q be unequal, only becomes approximately normal
when n is large, and this limitation must be remembered in applying
the table given, or similar more complete tables, to cases in which
the distribution is strictly binomial.

Table showing the Greater Fraction of the Area of a Normal Curve to One
Side of an Ordinate of Abscissa x/σ. (For references to more extended
tables, see list on pp. 357-8.)

   x/σ.    Greater Fraction   |    x/σ.    Greater Fraction
              of Area.        |               of Area.

   0           .50000         |    2.1          .98214
   0.1         .53983         |    2.2          .98610
   0.2         .57926         |    2.3          .98928
   0.3         .61791         |    2.4          .99180
   0.4         .65542         |    2.5          .99379
   0.5         .69146         |    2.6          .99534
   0.6         .72575         |    2.7          .99653
   0.7         .75804         |    2.8          .99744
   0.8         .78814         |    2.9          .99813
   0.9         .81594         |    3.0          .99865
   1.0         .84134         |    3.1          .99903
   1.1         .86433         |    3.2          .99931
   1.2         .88493         |    3.3          .99952
   1.3         .90320         |    3.4          .99966
   1.4         .91924         |    3.5          .99977
   1.5         .93319         |    3.6          .99984
   1.6         .94520         |    3.7          .99989
   1.7         .95543         |    3.8          .99993
   1.8         .96407         |    3.9          .99995
   1.9         .97128         |    4.0          .99997
   2.0         .97725         |    4.1          .99998
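The entries of this table need no longer be taken on trust from printed sources. As a sketch (added here, and relying on the error function `math.erf` of the Python standard library in place of the classical methods of tabulation), the greater fraction of area at any abscissa t = x/σ is ½[1 + erf(t/√2)]:

```python
import math

def greater_fraction(t):
    """Fraction of the area of the normal curve lying below an ordinate
    at t = x / sigma (the greater fraction when t >= 0)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# Reproduce the table at intervals of one-tenth of the standard-deviation.
for i in range(0, 42):
    t = i / 10
    print(f"{t:.1f}  {greater_fraction(t):.5f}")
```

The same function serves for values between the tabulated ones, so that no interpolation is required.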

17. If we try to determine the quartile deviation in terms of
the standard-deviation from the table, we see that it lies between
0.6 and 0.7σ. Interpolating, it is given approximately by

$$0.6\sigma + \frac{0.75000 - 0.72575}{0.75804 - 0.72575} \times 0.1\sigma = 0.675\sigma.$$

More exact interpolation gives the value 0.67448975σ. This result,
again, is the foundation of the rough rule that the
semi-interquartile range is usually some 2/3 of the standard-deviation: it is
strictly true for the normal curve only. It may be noted that
the constant 0.67448975 .... can be determined by processes of
interpolation only, and cannot be expressed exactly, like the
mean deviation, in terms of any other known constant, such
as π.
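Both the rough and the more exact determination of this constant can be sketched in a few lines. The code below is an illustration added here, not the method by which the published value was obtained: the first step is the linear interpolation between the tabulated areas at 0.6 and 0.7, and the second (an assumed substitute for finer interpolation) is a bisection on the area function until the fraction 0.75 is reached.

```python
import math

def area_below(t):
    """Fraction of the normal area below an ordinate at t = x / sigma."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# Linear interpolation between the tabulated values at 0.6 and 0.7.
rough = 0.6 + 0.1 * (0.75 - 0.72575) / (0.75804 - 0.72575)

# Bisection for the abscissa cutting off exactly three-quarters of the area.
lo, hi = 0.6, 0.7
for _ in range(60):
    mid = (lo + hi) / 2
    if area_below(mid) < 0.75:
        lo = mid
    else:
        hi = mid
quartile = (lo + hi) / 2

print(round(rough, 3), round(quartile, 8))
```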

It has become customary to use 0.674 .... times the standard
error rather than the standard error itself as a measure of the
unreliability of observed statistical results, and the term probable
error is given to this quantity. It should be noted that the word
"probable" is hardly used in its usual sense in this connection:
the probable error is merely a quantity such that we may expect
greater and less errors of simple sampling with about equal
frequency, provided always that the distribution of errors is
normal. On the whole, the use of the "probable error" has little
advantage compared with the standard error, and consequently little
stress is laid on it in the present work; but the term is in constant
use, and the student must be familiar with it.

It is true that the "probable error" has a simpler and more direct
significance than the standard error, but this advantage is lost as
soon as we come to deal with multiples of the probable error.
Further, the best modern tables of the ordinates and area of the
normal curve are given in terms of the standard-deviation or
standard error, not in terms of the probable error, and the
multiplication of the former by 0.6745, to obtain the probable error,
is not justified unless the distribution is normal. For very large
samples the distribution is approximately normal, even though p
and q are unequal; but this is not so for small samples, such as
often occur in practice. In the case of small samples the use of
the "probable error" is consequently of doubtful value, while the
standard error retains its significance as a measure of dispersion.
The "probable error," it may be mentioned, is often stated after
an observed proportion with the ± sign before it; a percentage
given as 20.5 ± 2.3 signifying "20.5 per cent., with a probable
error of 2.3 per cent."

If an error or deviation in, say, a certain proportion p only just
exceed the probable error, it is as likely as not to occur in simple
sampling: if it exceed twice the probable error (in either direction),
it is likely to occur as a deviation of simple sampling about 18
times in 100 trials, or the odds are about 4.6 to 1 against its
occurring at any one trial. For a range of three times the probable
error the odds are about 22 to 1, and for a range of four times the
probable error 142 to 1. Until a deviation exceeds, then, 4 times
the probable error, we cannot feel any great confidence that it is
likely to be "significant." It is simpler to work with the standard
error and take ± 3 times the standard error as the critical range:
for this range the odds are about 370 to 1 against such a
deviation occurring in simple sampling at any one trial.
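The odds quoted in this paragraph follow from the area of the normal curve lying beyond the given multiple in both directions. A short sketch of the calculation is given below; it is an added illustration with hypothetical function names, and it assumes, as the paragraph does, that the distribution of errors is normal.

```python
import math

PROBABLE_ERROR = 0.67448975   # in units of the standard error

def chance_of_exceeding(t):
    """Chance that a normal deviation exceeds t standard errors,
    taken in either direction."""
    return math.erfc(t / math.sqrt(2.0))

def odds_against(t):
    """Odds (to 1) against a deviation exceeding t standard errors."""
    p = chance_of_exceeding(t)
    return (1.0 - p) / p

for m in (1, 2, 3, 4):
    t = m * PROBABLE_ERROR
    print(m, "x probable error:",
          round(chance_of_exceeding(t), 4), round(odds_against(t), 1))
print("3 x standard error:", round(odds_against(3.0)))
```

The steep rise of the odds between two and four times the probable error is what makes a fixed multiple of the standard error the more convenient critical range.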

18.  The  following  are  a  few  miscellaneous  examples  of  the  use 
of  the  normal  curve  and  the  table  of  areas. 

Example i. A hundred coins are thrown a number of times.
How often approximately in 10,000 throws may (1) exactly 65
heads, (2) 65 heads or more, be expected?

The standard-deviation is √(0.5 × 0.5 × 100) = 5. Taking the
distribution as normal, $y_0$ = 797.9.

The mean number of heads being 50, 65 - 50 = 3σ. The
frequency of a deviation of 3σ is given at once by the table (p. 303)
as 797.9 × 0.0111 .... = 8.86, or nearly 9 throws in 10,000. A
throw of 65 heads will therefore be expected about 9 times.

The frequency of throws of 65 heads or more is given by the
area table (p. 310), but a little caution must now be used, owing
to the discontinuity of the distribution. A throw of 65 heads is
equivalent to a range of 64.5 to 65.5 on the continuous scale of the
normal curve, the division between 64 and 65 coming at 64.5.
Now 64.5 - 50 = +2.9σ, and a deviation of +2.9σ or more will only
occur, as given by the table, 187 times in 100,000 throws, or, say,
19 times in 10,000.
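The two answers of Example i can be checked as follows. This is a sketch added for illustration (hypothetical variable names), using the normal ordinate for the single term and the normal area above 64.5 for the tail, exactly as in the working above.

```python
import math

N, n, p = 10_000, 100, 0.5
mean = n * p                                   # 50 heads
sigma = math.sqrt(n * p * (1 - p))             # 5
y0 = N / (sigma * math.sqrt(2 * math.pi))      # central ordinate

def area_above(t):
    """Fraction of the normal area above an ordinate at t = x / sigma."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

# (1) Normal ordinate at a deviation of 3 sigma (65 heads).
throws_of_65 = y0 * math.exp(-((65 - mean) / sigma) ** 2 / 2)

# (2) Normal area above 64.5, the lower limit of "65 heads" on the
#     continuous scale.
throws_of_65_or_more = N * area_above((64.5 - mean) / sigma)

print(round(y0, 1), round(throws_of_65, 2), round(throws_of_65_or_more, 1))
```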

Example ii. Taking the data of the stature-distribution of fig.
49 (mean 67.46, standard-deviation 2.57 in.), what proportion of
all the individuals will be within a range of ± 1 inch of the
mean?

1 inch = 0.389σ. Simple interpolation in the table of p. 310
gives 0.65129 of the area below this deviation, or a more extended
table the more accurate value 0.65136. Within a range of
± 0.389σ the fraction of the whole area is therefore 0.30272, or the
statures of about 303 per thousand of the given population will lie
within a range of ± 1 inch from the mean.
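The interpolation and the final fraction of Example ii can be verified in the same way. The sketch below is an added illustration; it takes the deviation as 0.389σ, as in the working above, and compares the value interpolated from the short table with the area computed from the error function.

```python
import math

def area_below(t):
    """Fraction of the normal area below an ordinate at t = x / sigma."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

t = 0.389   # 1 inch in units of the standard-deviation, 1 / 2.57

# Simple interpolation between the tabulated areas at 0.3 and 0.4.
interpolated = 0.61791 + (t - 0.3) / 0.1 * (0.65542 - 0.61791)

# Area computed directly, and the fraction lying within +/- 1 inch.
accurate = area_below(t)
within_one_inch = 2 * accurate - 1

print(round(interpolated, 5), round(accurate, 5), round(within_one_inch, 5))
```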

Example iii. In a case of crossing a Mendelian recessive by a
heterozygote the expectation of recessive offspring is 50 per cent.
(1) How often would 30 recessives or more be expected amongst 50
offspring owing simply to fluctuations of sampling? (2) How many
offspring would have to be obtained in order to reduce the probable
error to 1 per cent.?

The standard error of the percentage of recessives for 50
observations is 50 √(1/50) = 7.07. Thirty recessives in fifty is
a deviation of 5 from the mean number of 25, or, if we take thirty
as representing 29.5 or more, 4.5 from the mean; that is, 9 per cent.
of the 50 offspring, or 9/7.07 = 1.273σ. A positive
deviation of this amount or more occurs about 10 times in 100,
so that 30 recessives or more would be expected in about
one-tenth of the batches of 50 offspring. We have assumed
normality for rather a small value of n, but the result is sufficiently
accurate for practical purposes.

As regards the second part of the question we are to have

$$0.6745 \times 50\,\sqrt{1/n} = 1,$$

n being the number of offspring. This gives n = 1137 to the
nearest unit.
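The arithmetic of Example iii can be checked with the sketch below, which is an added illustration and not part of the original working. The deviation of 4.5 offspring is here measured against the standard error of the number of recessives, √(50 × ½ × ½) = 3.54, which is the same thing as 9 per cent. measured against 7.07 per cent.; the exact binomial chance is also computed, as a check on the assumption of normality for so small a value of n.

```python
import math

n, p = 50, 0.5
se_number = math.sqrt(n * p * (1 - p))           # 3.54 offspring
se_percent = 100 * math.sqrt(p * (1 - p) / n)    # 7.07 per cent.

# (1) 30 or more recessives, taken as 29.5 or more on the continuous scale.
t = (29.5 - n * p) / se_number
normal_chance = 0.5 * math.erfc(t / math.sqrt(2.0))

# Exact binomial chance of 30 or more, for comparison.
exact_chance = sum(math.comb(n, r) for r in range(30, n + 1)) / 2 ** n

# (2) Number of offspring for a probable error of 1 per cent.
offspring_needed = (0.6745 * 50) ** 2

print(round(se_percent, 2), round(t, 3), round(normal_chance, 3),
      round(exact_chance, 3), round(offspring_needed))
```

The normal and the exact values agree to about the third decimal place, which bears out the remark that the approximation is sufficient for practical purposes here.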
Example iv. The diagram of fig. 49 shows that the number of
statures recorded in the group "62 in. and less than 63" is
markedly less than the theoretical value. Could such a difference
occur owing to fluctuations of simple sampling; and if so, how
often might it happen?

The actual frequency recorded is 169. To obtain the
theoretical frequency we may either take it as given roughly by the
ordinate in the centre of the interval, or, better, use the integral
table. Remembering that statures were only recorded to the
nearest 1/8 in., the true limits of the interval are 61 15/16 to 62 15/16, or
61.94 to 62.94, mid-value 62.44. This is a deviation from the
mean (67.46) of 5.02. Calculating the ordinate of the normal
curve directly we find the frequency 197.8. This is certainly, as
is evident from the form of the curve, a little too small. The
interval actually lies between deviations of 4.52 in. and 5.52
in., that is, 1.759σ and 2.148σ. The corresponding fractions of
area are 0.96071 and 0.98418, difference, or fraction of area
between the two ordinates, 0.02347. Multiplying this by the
whole number of observations (8585) we have the theoretical
frequency 201.5.

The difference of theoretical and observed frequencies is therefore
32.5. But the proportion of observations which should fall into
the given class is 0.023, the proportion falling into other classes
0.977, and the standard error of the class frequency is accordingly
√(0.023 × 0.977 × 8585) = 14.0. As the actual deviation is only
2.32 times this, it could certainly have occurred as a fluctuation of
sampling.

The question how often it might have occurred can only be
answered if we assume the distribution of fluctuations of sampling
to be approximately normal. It is true that p and q are very
unequal, but then n is very large (8585), so large that the
difference of the chances is fairly small compared with √(npq)
(about one-fifteenth). Hence we may take the distribution of
errors as roughly normal to a first approximation, though a
first approximation only. The tables give 0.990 of the area
below a deviation of 2.32σ, so we would expect an equal or
greater deficiency to occur about 10 times in 1000 trials, or once
in a hundred.
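The whole of Example iv may be retraced numerically. The sketch below is an added illustration with hypothetical variable names; it carries unrounded figures throughout, so that its results differ slightly in the last place from those of the text (a theoretical frequency of about 201.3 in place of 201.5, and a ratio of about 2.30 in place of 2.32), without affecting the conclusion.

```python
import math

N, mean, sigma = 8585, 67.46, 2.57
observed = 169
lower, upper = 61.94, 62.94       # true limits of the class, inches

def area_below(t):
    """Fraction of the normal area below an ordinate at t = x / sigma."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# (a) Rough value: ordinate at the mid-value of the interval (width 1 in.).
y0 = N / (sigma * math.sqrt(2 * math.pi))
mid_deviation = mean - (lower + upper) / 2
by_ordinate = y0 * math.exp(-(mid_deviation / sigma) ** 2 / 2)

# (b) Better value: area between the two bounding ordinates.
fraction = area_below((mean - lower) / sigma) - area_below((mean - upper) / sigma)
by_area = N * fraction

# Standard error of the class frequency, and the deficiency in terms of it.
standard_error = math.sqrt(fraction * (1 - fraction) * N)
ratio = (by_area - observed) / standard_error

# Chance of an equal or greater deficiency, assuming normal fluctuations.
chance = 1 - area_below(ratio)

print(round(by_ordinate, 1), round(by_area, 1), round(standard_error, 1),
      round(ratio, 2), round(chance, 3))
```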
EXERCISES.

1.  Calculate  the  theoretical  distributions  for  the  three  experimental  cases 
(1),  (2),  and  (3)  cited  in  §  7  of  Chapter  XIII. 

2.  Show  that  if  np  be  a  whole  number,  the  mean  of  the  binomial  coincides 
with  the  greatest  term. 

3. Show that if two symmetrical binomial distributions of degree n (and
of the same number of observations) are so superposed that the rth term of
the one coincides with the (r + 1)th term of the other, the distribution
formed by adding superposed terms is a symmetrical binomial of degree n + 1.

[Note: it follows that if two normal distributions of the same area and
standard-deviation are superposed so that the difference between the means is
small compared with the standard-deviation, the compound curve is very
nearly normal.]

4. Calculate the ordinates of the binomial 1024 (0.5 + 0.5)^10, and compare
them with those of the normal curve.

5. Draw a diagram showing the distribution of statures of Cambridge
students (Chap. VI., Table VII.), and a normal curve of the same area,
mean, and standard-deviation superposed thereon.

6. Compare the values of the semi-interquartile range for the stature
distributions of male adults in the United Kingdom and Cambridge students,
(1) as found directly, (2) as calculated from the standard-deviation, on the
assumption that the distribution is normal.

7. Taking the mean stature for the British Isles as 67.46 in. (the
distribution of fig. 49), the mean for Cambridge students as 68.85 in., and the
common standard-deviation as 2.56 in., what percentage of Cambridge students
exceed the British mean in stature, assuming the distribution normal?

8. As stated in Chap. XIII., Example ii., certain crosses of Pisum sativum
based on 7125 seeds gave 25.32 per cent. of green seeds instead of the theoretical
proportion 25 per cent., the standard error being 0.51 per cent. In what
percentage of experiments based on the same number of seeds might an equal or
greater percentage be expected to occur owing to fluctuations of sampling
alone?

9. In what proportion of similar experiments based on (1) 100 seeds, (2)
1000 seeds, might (a) 30 per cent. or more, (b) 35 per cent. or more, of green
seeds, be expected to occur, if ever?

10. In similar experiments, what number of seeds must be obtained to
make the "probable error" of the proportion 1 per cent.?

11. If skulls are classified as dolichocephalic when the length-breadth
index is under 75, mesocephalic when the same index lies between 75 and 80,
and brachycephalic when the index is over 80, find approximately (assuming
that the distribution is normal) the mean and standard-deviation of a series
in which 58 per cent. are stated to be dolichocephalic, 38 per cent.
mesocephalic, and 4 per cent. brachycephalic.


CHAPTER  XVI. 
NORMAL  CORRELATION. 

1-3.  Deduction  of  the  general  expression  for  the  normal  correlation  surface 
from  the  case  of  independence — 4.  Constancy  of  the  standard- 
deviations  of  parallel  arrays  and  linearity  of  the  regression— 5.  The 
contour  lines :  a  series  of  concentric  and  similar  ellipses — 6.  The 
normal  surface  for  two  correlated  variables  regarded  as  a  normal 
surface  for  uncorrelated  variables  rotated  with  respect  to  the  axes  of 
measurement :  arrays  taken  at  any  angle  across  the  surface  are  normal 
distributions  with  constant  standard-deviation  :  distribution  of  and 
correlation  between  linear  functions  of  two  normally  correlated 
variables  are  normal  :  principal  axes — 7.  Standard-deviations  round 
the  principal  axes — 8-11.  Investigation  of  Table  III.,  Chap.  IX.,  to 
test  normality  :  linearity  of  regression,  constancy  of  standard-deviation 
of  arrays,  normality  of  distribution  obtained  by  diagonal  addition, 
contour  lines— 12-13.  Isotropy  of  the  normal  distribution  for  two 
variables — 14.  Outline  of  the  principal  properties  of  the  normal  dis- 
tribution for  n  variables. 

1. The expression that we have obtained for the "normal"
distribution of a single variable may readily be made to yield a
corresponding expression for the distribution of frequency of pairs
of values of two variables. This normal distribution for two
variables, or "normal correlation surface," is of great historical
importance, as the earlier work on correlation is, almost
without exception, based on the assumption of such a distribution;
though when it was recognised that the properties of the
correlation-coefficient could be deduced, as in Chap. IX., without reference
to the form of the distribution of frequency, a knowledge of
this special type of frequency-surface ceased to be so essential.
But the generalised normal law is of importance in the theory of
sampling: it serves to describe very approximately certain actual
distributions (e.g. of measurements on man); and if it can be
assumed to hold good, some of the expressions in the theory of
correlation, notably the standard-deviations of arrays (and, if
more than two variables are involved, the partial
correlation-coefficients), can be assigned more simple and definite meanings
than in the general case. The student should, therefore, be
familiar with the more fundamental properties of the distribution.

2. Consider first the case in which the two variables are
completely independent. Let the distributions of frequency for the
two variables $x_1$ and $x_2$, singly, be

$$y_1 = \frac{N}{\sigma_1\sqrt{2\pi}}\, e^{-x_1^2/2\sigma_1^2}, \qquad y_2 = \frac{N}{\sigma_2\sqrt{2\pi}}\, e^{-x_2^2/2\sigma_2^2}. \qquad (1)$$

Then, assuming independence, the frequency-distribution of pairs
of values must, by the rule of independence, be given by

$$y_{12} = y_{12}'\, e^{-\frac{1}{2}\left(\frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2}\right)}, \qquad (2)$$

where

$$y_{12}' = \frac{N}{2\pi\,\sigma_1\sigma_2}. \qquad (3)$$

Equation (2) gives a normal correlation surface for one special
case, the correlation-coefficient being zero. If we put $x_2$ = a
constant, we see that every section of the surface by a vertical plane
parallel to the $x_1$ axis, i.e. the distribution of any array of $x_1$'s, is
a normal distribution, with the same mean and standard-deviation
as the total distribution of $x_1$'s, and a similar statement holds for
the array of $x_2$'s; these properties must hold good, of course, as
the two variables are assumed independent (cf. Chap. V. § 13).
The contour lines of the surface, that is to say, lines drawn on
the surface at a constant height, are a series of similar ellipses
with major and minor axes parallel to the axes of $x_1$ and $x_2$ and
proportional to $\sigma_1$ and $\sigma_2$, the equations to the contour lines being
of the general form

$$\frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2} = C^2. \qquad (4)$$

Pairs of values of $x_1$ and $x_2$ related by an equation of this form
are, therefore, equally frequent.

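3. To pass from this special case of independence to the general
case of two correlated variables, remember (Chap. XII. § 8)
that if

$$ x_{1.2} = x_1 - b_{12}\,x_2, \qquad x_{2.1} = x_2 - b_{21}\,x_1, $$

$x_1$ and $x_{2.1}$, as also $x_2$ and $x_{1.2}$, are uncorrelated. If they are not
merely uncorrelated but completely independent, and if the
distribution of each of the deviations singly be normal, we must have
for the frequency-distribution of pairs of deviations of $x_1$ and $x_{2.1}$

$$ y_{12} = y'_{12}\, e^{-\frac{1}{2}\left(\frac{x_1^2}{\sigma_1^2} + \frac{x_{2.1}^2}{\sigma_{2.1}^2}\right)} \tag{5} $$

But

$$ \begin{aligned}
\frac{x_1^2}{\sigma_1^2} + \frac{x_{2.1}^2}{\sigma_{2.1}^2}
 &= \frac{x_1^2}{\sigma_1^2} + \frac{x_2^2 + b_{21}^2\,x_1^2 - 2\,b_{21}\,x_1 x_2}{\sigma_{2.1}^2} \\
 &= \frac{x_1^2}{\sigma_{1.2}^2} + \frac{x_2^2}{\sigma_{2.1}^2} - 2\,r_{12}\,\frac{x_1 x_2}{\sigma_{1.2}\,\sigma_{2.1}}
\end{aligned} $$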


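Evidently we would also have arrived at precisely the same
expression if we had taken the distribution of frequency for
$x_2$ and $x_{1.2}$, and reduced the exponent

$$ \frac{x_2^2}{\sigma_2^2} + \frac{x_{1.2}^2}{\sigma_{1.2}^2}. $$

We have, therefore, the general expression for the normal
correlation surface for two variables

$$ y_{12} = y'_{12}\, e^{-\frac{1}{2}\left(\frac{x_1^2}{\sigma_{1.2}^2} + \frac{x_2^2}{\sigma_{2.1}^2} - 2\,r_{12}\frac{x_1 x_2}{\sigma_{1.2}\,\sigma_{2.1}}\right)} \tag{6} $$

Further, since $x_1$ and $x_{2.1}$, $x_2$ and $x_{1.2}$, are independent, we must
have

$$ y'_{12} = \frac{N}{2\pi\,\sigma_1\sigma_{2.1}} = \frac{N}{2\pi\,\sigma_2\sigma_{1.2}} = \frac{N}{2\pi\,\sigma_1\sigma_2\,(1 - r_{12}^2)^{1/2}} \tag{7} $$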
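The reduction of the exponent carried out in this section can be checked by direct arithmetic. The Python sketch below is an illustration only: the function names and the sample constants are assumptions made for the purpose and do not come from the text. It evaluates the exponent once from the independent pair $x_1$, $x_{2.1}$ and once in the general symmetrical form, and gives the central ordinate in its last form.

```python
import math

def exponent_via_residual(x1, x2, s1, s2, r):
    """Exponent built from the independent pair x1 and x2.1 = x2 - b21*x1."""
    b21 = r * s2 / s1                    # regression of x2 on x1
    s21 = s2 * math.sqrt(1 - r * r)      # sigma_2.1
    x21 = x2 - b21 * x1
    return x1**2 / s1**2 + x21**2 / s21**2

def exponent_general(x1, x2, s1, s2, r):
    """Exponent in the symmetrical form of the normal correlation surface."""
    s12 = s1 * math.sqrt(1 - r * r)      # sigma_1.2
    s21 = s2 * math.sqrt(1 - r * r)      # sigma_2.1
    return x1**2 / s12**2 + x2**2 / s21**2 - 2 * r * x1 * x2 / (s12 * s21)

def central_ordinate(n, s1, s2, r):
    """Central ordinate N / (2*pi*s1*s2*sqrt(1 - r^2))."""
    return n / (2 * math.pi * s1 * s2 * math.sqrt(1 - r * r))

# Illustrative constants, assumed for this example only.
s1, s2, r = 2.0, 3.0, 0.6
a = exponent_via_residual(1.5, -0.7, s1, s2, r)
b = exponent_general(1.5, -0.7, s1, s2, r)
print(a, b)   # the two forms should agree to rounding error
```

Any other pair of deviations may be substituted for (1.5, -0.7); the agreement does not depend on the point chosen.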

4. If we assign to $x_2$ some fixed value, say $h_2$, we have the
distribution of the array of $x_1$'s of type $h_2$,

$$ \begin{aligned}
y_{12} &= y'_{12}\, e^{-\frac{1}{2}\left(\frac{x_1^2}{\sigma_{1.2}^2} + \frac{h_2^2}{\sigma_{2.1}^2} - 2\,r_{12}\frac{x_1 h_2}{\sigma_{1.2}\,\sigma_{2.1}}\right)} \\
       &= y'_{12}\, e^{-h_2^2/2\sigma_2^2}\; e^{-\left(x_1 - r_{12}\frac{\sigma_1}{\sigma_2}h_2\right)^2 / 2\sigma_{1.2}^2}
\end{aligned} $$

This is a normal distribution of standard-deviation $\sigma_{1.2}$, with a
mean deviating by $r_{12}\frac{\sigma_1}{\sigma_2}h_2$ from the mean of the whole distribution
of $x_1$'s. As $h_2$ represents any value whatever of $x_2$, we see
(1) that the standard-deviations of all arrays of $x_1$ are the same,
and equal to $\sigma_{1.2}$: (2) that the regression of $x_1$ on $x_2$ is strictly
linear. Similarly, of course, if we assign to $x_1$ any value $h_1$, we
will find (1) that the standard-deviations of all arrays of $x_2$ are
the same: (2) that the regression of $x_2$ on $x_1$ is strictly linear.
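The statement that every array of $x_1$ is a normal curve with mean $r_{12}(\sigma_1/\sigma_2)h_2$ and standard-deviation $\sigma_{1.2}$ can be verified numerically. The following Python sketch is illustrative only; the function names and sample constants are assumptions. It compares the frequency read from the surface with the same frequency written as a normal curve in $x_1$.

```python
import math

def surface(x1, x2, n, s1, s2, r):
    """Frequency of the normal correlation surface at the deviations (x1, x2)."""
    q = 1 - r * r
    y0 = n / (2 * math.pi * s1 * s2 * math.sqrt(q))
    phi = (x1**2 / s1**2 + x2**2 / s2**2 - 2 * r * x1 * x2 / (s1 * s2)) / q
    return y0 * math.exp(-phi / 2)

def array_of_x1(x1, h2, n, s1, s2, r):
    """The same frequency written as a normal curve in x1 for fixed x2 = h2."""
    s12 = s1 * math.sqrt(1 - r * r)      # standard-deviation of every array
    mean = r * s1 / s2 * h2              # mean of the array, linear in h2
    y0 = n / (2 * math.pi * s1 * s2 * math.sqrt(1 - r * r))
    return (y0 * math.exp(-h2**2 / (2 * s2**2))
               * math.exp(-(x1 - mean)**2 / (2 * s12**2)))

# Illustrative constants, assumed for this example only.
n, s1, s2, r = 1000, 2.0, 3.0, 0.6
for x1, h2 in [(0.0, 0.0), (1.0, 2.0), (-3.0, 1.5), (2.5, -4.0)]:
    print(surface(x1, h2, n, s1, s2, r), array_of_x1(x1, h2, n, s1, s2, r))
```

The array mean moves along a straight line as $h_2$ changes, while the standard-deviation in the second function does not involve $h_2$ at all.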


Fig. 50. Principal Axes and Contour Lines of the Normal
Correlation Surface.


5. The contour lines are, as in the case of independence, a
series of concentric and similar ellipses; the major and minor
axes are, however, no longer parallel to the axes of $x_1$ and $x_2$, but
make a certain angle with them. Fig. 50 illustrates the calculated
form of the contour lines for one case, RR and CC being
the lines of regression. As each line of regression cuts every
array of $x_1$ or of $x_2$ in its mean, and as the distribution of every
array is symmetrical about its mean, RR must bisect every
horizontal chord and CC every vertical chord, as illustrated
by the two chords shown by dotted lines: it also follows that
RR cuts all the ellipses in the points of contact of the horizontal
tangents to the ellipses, and CC in the points of contact of
the vertical tangents. The surface or solid itself, somewhat
truncated, is shown in fig. 29, p. 166.

6. Since, as we see from fig. 50, a normal surface for two
correlated variables may be regarded merely as a certain surface
for which r is zero turned round through some angle, and since
for every angle through which it is turned the distributions of all
$x_1$ arrays and $x_2$ arrays are normal, it follows that every section
of a normal surface by a vertical plane is a normal curve, i.e. the
distributions of arrays taken at any angle across the surface are
normal. It also follows that, since the total distributions of $x_1$
and $x_2$ must be normal for every angle through which the surface
is turned, the distributions of totals given by slices or arrays
taken at any angle across a normal surface must be normal
distributions. But these would give the distributions of functions
like $a.x_1 \pm b.x_2$, and consequently (1) the distribution of any
linear function of two normally distributed variables $x_1$ and $x_2$
must also be normal; (2) the correlation between any two linear
functions of two normally distributed variables must be normal
correlation.

To find the angle $\theta$ through which the surface has been turned,
from the position for which the correlation is zero to the position
for which the coefficient has some assigned value r, we must use
a little trigonometry. The major and minor axes of the ellipses
are sometimes termed the principal axes. If $\xi_1$, $\xi_2$ be the
co-ordinates referred to the principal axes (the $\xi_1$-axis being the
$x_1$ axis in its new position) we have for the relation between $\xi_1$,
$\xi_2$, $x_1$, $x_2$, the angle $\theta$ being taken as positive for a rotation of
the $x_1$-axis which will make it, if continued through 90°, coincide
in direction and sense with the $x_2$-axis,

$$ \left.\begin{aligned}
\xi_1 &= x_1\cos\theta + x_2\sin\theta \\
\xi_2 &= x_2\cos\theta - x_1\sin\theta
\end{aligned}\right\} \tag{8} $$

But, since $\xi_1$, $\xi_2$ are uncorrelated, $\Sigma(\xi_1\xi_2) = 0$. Hence, multiplying
together equations (8) and summing,

$$ 0 = (\sigma_2^2 - \sigma_1^2)\sin 2\theta + 2\,r_{12}\,\sigma_1\sigma_2\cos 2\theta $$

$$ \tan 2\theta = \frac{2\,r_{12}\,\sigma_1\sigma_2}{\sigma_1^2 - \sigma_2^2} \tag{9} $$

It should be noticed that if we define the principal axes of any
distribution for two variables as being a pair of axes at right
angles for which the variables $\xi_1$, $\xi_2$ are uncorrelated, equation
(9) gives the angle that they make with the axes of measurement
whether the distribution be normal or no.
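As a hedged illustration of equation (9), the Python sketch below computes the angle of the principal axes and confirms that the product-moment of the rotated co-ordinates vanishes. The function names are assumptions of this sketch; `math.atan2` is used to place $2\theta$ in the proper quadrant, and the branch it selects when r is negative is a convention of the sketch, not of the text. The constants are those quoted for the stature table later in the chapter.

```python
import math

def principal_axis_angle(s1, s2, r):
    """Angle theta (radians) satisfying tan 2*theta = 2*r*s1*s2 / (s1^2 - s2^2)."""
    return 0.5 * math.atan2(2 * r * s1 * s2, s1**2 - s2**2)

def product_moment_after_rotation(s1, s2, r, theta):
    """Mean product of xi1 and xi2 when the axes are turned through theta."""
    return (0.5 * (s2**2 - s1**2) * math.sin(2 * theta)
            + r * s1 * s2 * math.cos(2 * theta))

theta = principal_axis_angle(2.72, 2.75, 0.51)
print(math.degrees(theta))                                   # close to 45 deg 37 min
print(product_moment_after_rotation(2.72, 2.75, 0.51, theta))  # effectively zero
```

Because the two standard-deviations are nearly equal, the denominator of (9) is small and the angle lies close to 45°.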

7. The two standard-deviations, say $\Sigma_1$ and $\Sigma_2$, about the
principal axes are of some interest, for evidently from § 2 the
major and minor axes of the contour-ellipses are proportional
to these two standard-deviations. They may be most readily
determined as follows. Squaring the two transformation equations
(8), summing and adding, we have

$$ \Sigma_1^2 + \Sigma_2^2 = \sigma_1^2 + \sigma_2^2 \tag{10} $$

Referring the surface to the axes of measurement, we have for
the central ordinate by equation (7)

$$ y'_{12} = \frac{N}{2\pi\,\sigma_1\sigma_2\,(1 - r_{12}^2)^{1/2}}. $$

Referring it to the principal axes, by equation (3)

$$ y'_{12} = \frac{N}{2\pi\,\Sigma_1\Sigma_2}. $$

But these two values of the central ordinate must be equal,
therefore

$$ \Sigma_1\Sigma_2 = \sigma_1\sigma_2\,(1 - r_{12}^2)^{1/2} \tag{11} $$

(10) and (11) are a pair of simultaneous equations from which
$\Sigma_1$ and $\Sigma_2$ may be very simply obtained in any arithmetical case.
Care must, however, be taken to give the correct signs to the
square root in solving. $\Sigma_1 + \Sigma_2$ is necessarily positive, and $\Sigma_1 - \Sigma_2$
also if r is positive, the major axes of the ellipses lying along $\xi_1$:
but if r be negative, $\Sigma_1 - \Sigma_2$ is also negative. It should be noted
that, while we have deduced (11) from a simple consideration
depending on the normality of the distribution, it is really of
general application (like equation 10), and may be obtained at
somewhat greater length from the equations for transforming
co-ordinates.
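The solution of the simultaneous equations (10) and (11) may be sketched in Python as follows. The function name is an assumption of this sketch; the sign given to the difference follows the rule stated in the text, and the constants are those quoted for the stature table later in the chapter.

```python
import math

def principal_sds(s1, s2, r):
    """Solve S1^2 + S2^2 = s1^2 + s2^2 and S1*S2 = s1*s2*sqrt(1 - r^2)."""
    total = s1**2 + s2**2
    twice_product = 2 * s1 * s2 * math.sqrt(1 - r * r)
    plus = math.sqrt(total + twice_product)    # S1 + S2, necessarily positive
    minus = math.sqrt(total - twice_product)   # numerical value of S1 - S2
    if r < 0:
        minus = -minus                         # sign rule stated in the text
    return (plus + minus) / 2, (plus - minus) / 2

sd_xi1, sd_xi2 = principal_sds(2.72, 2.75, 0.51)
print(round(sd_xi1, 2), round(sd_xi2, 2))      # prints 3.36 1.91
```

Adding twice (11) to (10) gives the square of the sum, and subtracting gives the square of the difference, so only two square roots are needed.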

8.  As  stated  in  Chap.  XV.  §  13,  the  frequency-distribution 
for  any  variable  may  be  expected  to  be  approximately  normal 
if  that  variable  may  be  regarded  as  the  sum  (or,  within  limits, 
some  slightly  more  complex  function)  of  a  large  number  of  other 
variables,  provided  that  these  elementary  component  variables 
are  independent,  or  nearly  so.  Similarly,  the  correlation  between 
two variables may be expected to be approximately normal if
each of the two variables may be regarded as the sum, or some
slightly  more  complex  function,  of  a  large  number  of  elementary 
component  variables,  the  intensity  of  correlation  depending  on 
the  proportion  of  the  components  common  to  the  two  variables. 

Stature  is  a  highly  compound  character  of  this  kind,  and  we 
have  seen  that,  in  one  instance  at  least,  the  distribution  of  stature 
for  a  number  of  adults  is  given  approximately  by  the  normal 
curve.  We  can  now  utilise  Table  III.,  Chap.  IX.,  p.  160,  showing 
the  correlation  between  stature  of  father  and  son,  to  test,  as  far 
as  we  can  by  elementary  methods,  whether  the  normal  surface 
will  fit  the  distribution  of  the  same  character  in  pairs  of  indi- 
viduals :  we  leave  it  to  the  student  to  test,  as  far  as  he  can  do  so 
by  simple  graphical  methods,  the  approximate  normality  of  the 
total  distributions  for  this  table.  The  first  important  property 
of  the  normal  distribution  is  the  linearity  of  the  regression. 
This was well illustrated in fig. 37, p. 174, and the closeness of
the regression to linearity was confirmed by the values of
the correlation-ratios (p. 206), viz., 0.52 in each case as compared
with a correlation of 0.51. Subject to some investigation
as to the possibility of the deviations that do occur
arising as fluctuations of simple sampling, when drawing
samples from a record for which the regression is strictly
linear, we may conclude that the regression is appreciably
linear.

9. The second important property of the normal distribution
for two variables is the constancy of the standard-deviation for
all parallel arrays. We gave in Chap. X. p. 204 the standard-deviations
of ten of the columns of the present table, from the
column headed 62.5-63.5 onwards; these were:

      2.56      2.60
      2.11      2.26
      2.55      2.26
      2.24      2.45
      2.23      2.33

the mean being 2.36. The standard-deviations again only fluctuate
irregularly round their mean value. The mean of the first five
(the left-hand column) is 2.34, of the second five 2.38, a difference
of only 0.04: of the first group, two are greater and three are less
than the mean, and the same is true of the second group. There
does not seem to be any indication of a general tendency for the
standard-deviation to increase or decrease as we pass from one end
of the table to the other. We are not yet in a position to test how
far the differences from the average standard deviation might
arise in sampling from a record in which the distribution was
strictly normal, but, as a fact, a rough test suggests that they
might have done so.
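The small arithmetic of this section is easily retraced. In the Python sketch below the ten standard-deviations are entered as printed, reading down the first column and then the second; that reading order is an assumption, adopted because it reproduces the two group means quoted in the text. The variable names are likewise illustrative.

```python
# Standard-deviations of the ten arrays, first column then second column.
sds = [2.56, 2.11, 2.55, 2.24, 2.23, 2.60, 2.26, 2.26, 2.45, 2.33]

first, second = sds[:5], sds[5:]
mean_all = sum(sds) / len(sds)
mean_first = sum(first) / len(first)
mean_second = sum(second) / len(second)

# Number of arrays in each group lying above the general mean.
above_first = sum(1 for s in first if s > mean_all)
above_second = sum(1 for s in second if s > mean_all)

print(round(mean_all, 2), round(mean_first, 2), round(mean_second, 2))
print(above_first, above_second)
```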

10.  Next  we  note  that  the  distributions  of  all  arrays  of  a 
normal  surface  should  themselves  be  normal.  Owing,  however, 
to  the  small  numbers  of  observations  in  any  array,  the  distributions 
of  arrays  are  very  irregular,  and  their  normality  cannot  be  tested 
in  any  very  satisfactory  way :  we  can  only  say  that  they  do  not 
exhibit  any  marked  or  regular  asymmetry.  But  we  can  test  the 
allied  property  of  a  normal  correlation-table,  viz.  that  the  totals 
of  arrays  must  give  a  normal  distribution  even  if  the  arrays  be 
taken  diagonally  across  the  surface,  and  not  parallel  to  either 
axis of measurement (cf. § 6). From an ordinary correlation-table
we cannot find the totals of such diagonal arrays exactly,
but the totals of arrays at an angle of 45° will be given with
sufficient accuracy for our present purpose by the totals of lines
of diagonally adjacent compartments. Referring again to Table
III., Chap. IX., and forming the totals of such diagonals (running
up from left to right), we find, starting at the top left-hand
corner of the table, the following distribution (to be read down
the first column and then down the second):

       0.25       78.75
       2          81.25
       3.25       66.5
       6.25       59.25
       8          42.25
       9.75       30.75
      17          29.25
      34.5        19
      42          10.75
      46.25        7
      60.5         4.25
      67.5         3.5
      85.75        1.75
      87.25        1
      78           0.25
      94.25

      Total 1078


The mean of this distribution is at 0.359 of an interval above the
centre of the interval with frequency 78: its standard-deviation
is 4.757 intervals, or, remembering that the interval is $1/\sqrt{2}$ of
an inch, 3.364 inches. (This value may be checked directly from
the constants for the table given in Chap. IX., Question 3, p. 189,
for we have from the first of the transformation equations (8),

$$ \sigma_{\xi_1}^2 = \sigma_1^2\cos^2\theta + \sigma_2^2\sin^2\theta + 2\,r_{12}\,\sigma_1\sigma_2\sin\theta\cos\theta, $$

and inserting $\sigma_1 = 2.72$, $\sigma_2 = 2.75$, $r_{12} = 0.51$, $\sin\theta = \cos\theta = 1/\sqrt{2}$,
find $\sigma_{\xi_1} = 3.361$.) Drawing a diagram and fitting a normal
curve we have fig. 51; the distribution is rather irregular but the
fit is fair; certainly there is no marked asymmetry, and, so far as
the graphical test goes, the distribution may be regarded as
appreciably normal. One of the greatest divergences of the
actual distribution from the normal curve occurs in the almost
central interval with frequency 78: the difference between the
observed and calculated frequencies is here 12 units, but the
standard error is 9.1, so that it may well have occurred as a
fluctuation of simple sampling.
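The constants of the diagonal distribution can be recomputed from the totals given above. In the Python sketch below the totals are taken in the order obtained by reading the printed two-column table down the first column and then down the second; this ordering is an assumption, adopted because it reproduces the mean quoted in the text. The variable names are illustrative.

```python
import math

# Diagonal totals, starting from the top left-hand corner of the table.
totals = [0.25, 2, 3.25, 6.25, 8, 9.75, 17, 34.5, 42, 46.25, 60.5, 67.5,
          85.75, 87.25, 78, 94.25, 78.75, 81.25, 66.5, 59.25, 42.25, 30.75,
          29.25, 19, 10.75, 7, 4.25, 3.5, 1.75, 1, 0.25]

n = sum(totals)
origin = totals.index(78)                  # the interval with frequency 78
dev = [k - origin for k in range(len(totals))]

mean = sum(f * d for f, d in zip(totals, dev)) / n
var = sum(f * d * d for f, d in zip(totals, dev)) / n - mean**2
sd_intervals = math.sqrt(var)
sd_inches = sd_intervals / math.sqrt(2)    # one diagonal interval = 1/sqrt(2) inch

# Check from the constants of the table, putting theta = 45 degrees.
s1, s2, r = 2.72, 2.75, 0.51
sd_check = math.sqrt(0.5 * s1**2 + 0.5 * s2**2 + r * s1 * s2)

print(n, round(mean, 3), round(sd_intervals, 3), round(sd_inches, 3), round(sd_check, 3))
```

The two values of the standard-deviation in inches differ only in the third decimal place, as the text remarks.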


Fig. 51. Distribution of Frequency obtained by addition of Table III.,
Chap. IX., along Diagonals running up from left to right, fitted with a
Normal Curve.


11.  So  far,  we  have  seen  (1)  that  the  regression  is  approxi- 
mately linear ;  (2)  that,  in  the  arrays  which  we  have  tested,  the 
standard-deviations  are  approximately  constant,  or  at  least  that 
their  differences  are  only  small,  irregular  and  fluctuating  ;  (3)  that 
the  distribution  of  totals  for  one  set  of  diagonal  arrays  is  approxi- 
mately normal.  These  results  suggest,  though  they  cannot 
completely  prove,  that  the  whole  distribution  of  frequency  may 
be  regarded  as  approximately  normal,  within  the  limits  of  fluctu- 
ations of  sampling.  We  may  therefore  apply  a  more  searching 
test,  viz.  the  form  of  the  contour  lines  and  the  closeness  of  their 
fit  to  the  contour-ellipses  of  the  normal  surface.  We  can  see  at 
once,  however,  that  no  very  close  fit  can  be  expected.  Since  the 
frequencies  in  the  compartments  of  the  table  are  small,  the 
standard error of any frequency is given approximately by its
square root (Chap. XIII. § 12), and this implies a standard error
of  about  5  units  at  the  centre  of  the  table,  3  units  for  a  frequency 
of  9,  or  2  units  for  a  frequency  of  4  :  such  fluctuations  might 
cause  wide  divergences  in  the  corresponding  contour  lines. 

Using the suffix 1 to denote the constants relating to the
distribution of stature for fathers, and 2 the same constants for
the sons,

$$ N = 1078, \qquad M_1 = 67.70, \qquad M_2 = 68.66, $$

$$ \sigma_1 = 2.72, \qquad \sigma_2 = 2.75, \qquad r_{12} = 0.51. $$

Hence we have from equation (7)

$$ y'_{12} = 26.7 $$

and the complete expression for the fitted normal surface is

$$ y_{12} = 26.7\; e^{-\frac{1}{2}\left(\frac{x_1^2}{5.47} + \frac{x_2^2}{5.60} - \frac{x_1 x_2}{5.43}\right)}. $$

The equation to any contour ellipse will be given by equating
the index of $e$ to a constant, but it is very much easier to draw
the ellipses if we refer them to their principal axes. To do this
we must first determine $\theta$, $\Sigma_1$ and $\Sigma_2$. From (9),

$$ \tan 2\theta = -46.49, $$

whence $2\theta$ = 91° 14', $\theta$ = 45° 37', the principal axes standing very
nearly at an angle of 45° with the axes of measurement,
owing to the two standard-deviations being very nearly equal.
They should be set off on the diagram, not with a protractor, but
by taking $\tan\theta$ from the tables (1.022) and calculating points on
each axis on either side of the mean.
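The constants of the fitted surface follow directly from N, the two standard-deviations and the correlation. The Python sketch below, with illustrative variable names that are assumptions of the sketch, recomputes the central ordinate, the three divisors appearing in the exponent, and the value of $\tan 2\theta$.

```python
import math

n, s1, s2, r = 1078, 2.72, 2.75, 0.51
q = 1 - r * r

y0 = n / (2 * math.pi * s1 * s2 * math.sqrt(q))   # central ordinate
d11 = s1**2 * q                  # divisor of x1^2, i.e. sigma_1.2 squared
d22 = s2**2 * q                  # divisor of x2^2, i.e. sigma_2.1 squared
d12 = s1 * s2 * q / (2 * r)      # divisor of the product term x1*x2
tan_2theta = 2 * r * s1 * s2 / (s1**2 - s2**2)

print(round(y0, 1), round(d11, 2), round(d22, 2), round(d12, 2), round(tan_2theta, 2))
```

The very large numerical value of $\tan 2\theta$ arises from the near equality of the two standard-deviations, which makes the denominator small.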

To obtain $\Sigma_1$ and $\Sigma_2$ we have from (10) and (11)

$$ \Sigma_1^2 + \Sigma_2^2 = 14.961 $$
$$ 2\,\Sigma_1\Sigma_2 = 12.868 $$

Adding and subtracting these equations from each other and
taking the square root,

$$ \Sigma_1 + \Sigma_2 = 5.275 $$
$$ \Sigma_1 - \Sigma_2 = 1.447 $$

whence $\Sigma_1 = 3.36$, $\Sigma_2 = 1.91$; owing to the principal axes standing
nearly at 45° the first value is sensibly the same as that found
for $\sigma_{\xi_1}$ in § 10. The equations to the contour ellipses, referred to
the principal axes, may therefore be written in the form

$$ \frac{\xi_1^2}{(3.36)^2} + \frac{\xi_2^2}{(1.91)^2} = c^2, $$

the major and minor axes being 3.36 × c and 1.91 × c respectively.
To find c for any assigned value of the frequency y we have

$$ c^2 = \frac{2\,(\log y'_{12} - \log y)}{\log e}. $$
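The formula for c may be applied as in the following Python sketch, offered as an illustration under stated assumptions: natural logarithms are used, so that the division by log e disappears, and c is rounded to two decimals before the semi-axes are formed, as the rounded figures of the text suggest was done. The function name is hypothetical.

```python
import math

y0 = 26.7                          # central ordinate of the fitted surface
sd_major, sd_minor = 3.36, 1.91    # standard-deviations about the principal axes

def contour_constant(y, y0):
    """c for the contour of frequency y: c^2 = 2 * ln(y0 / y)."""
    return math.sqrt(2 * math.log(y0 / y))

rows = []
for y in (5, 10, 20):
    c = round(contour_constant(y, y0), 2)
    rows.append((y, c, round(sd_major * c, 2), round(sd_minor * c, 2)))

for row in rows:
    print(row)   # frequency, c, semi-major axis, semi-minor axis
```

The contour for a higher frequency lies nearer the mean, so that c diminishes as y increases and vanishes when y equals the central ordinate.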

Supposing that we desire to draw the three contour-ellipses for
y = 5, 10 and 20, we find c = 1.83, 1.40 and 0.76, or the following
values for the major and minor axes of the ellipses: semi-major
axes, 6.15, 4.70, 2.55: semi-minor axes, 3.50, 2.67, 1.45. The
ellipses drawn with these axes are shown in fig. 52, very much
reduced, of course, from the original drawing, one of the squares
shown representing a square inch on the original. The actual
contour lines for the same frequencies are shown by the irregular
polygons superposed on the ellipses, the points on these polygons
having been obtained by simple graphical interpolation between
the frequencies in each row and each column, diagonal interpolation
between the frequencies in a row and the frequencies in a
column not being used. It will be seen that the fit of the two
lower contours is, on the whole, fair, especially considering the
high standard errors. In the case of the central contour, y = 20,
the fit looks very poor to the eye, but if the ellipse be compared
carefully with the table, the figures suggest that here again we
have only to deal with the effects of fluctuations of sampling.
For father's stature = 66 in., son's stature = 70 in., there is
a frequency of 18.75, and an increase in this much less than the
standard error would bring the actual contour outside the ellipse.
Again, for father's stature = 68 in., son's stature = 71 in., there
is a frequency of 19, and an increase of a single unit would give
a point on the actual contour below the ellipse. Taking the
results as a whole, the fit must be regarded as quite as good as
we could expect with such small frequencies. It is perhaps of
historical interest to note that Sir Francis Galton, working without
a knowledge of the theory of normal correlation, suggested
that the contour lines of a similar table for the inheritance of
stature seemed to be closely represented by a series of concentric
and similar ellipses (ref. 2): the suggestion was confirmed when
he handed the problem, in abstract terms, to a mathematician,
Mr J. D. Hamilton Dickson (ref. 4), asking him to investigate
"the Surface of Frequency of Error that would result from
these data, and the various shapes and other particulars of its
sections that were made by horizontal planes" (ref. 3, p. 102).

Fig. 52. Contour Lines for the Frequencies 5, 10 and 20 of the distribution
of Table III., Chap. IX., and corresponding Contour Ellipses of the fitted
Normal Surface. $P_1 P_1$, $P_2 P_2$, principal axes: M, mean. (Horizontal
axis: Stature of Father, inches.)

12. The normal distribution of frequency for two variables is
an isotropic distribution, to which all the theorems of Chap. V.
§§ 11-12 apply. For if we isolate the four compartments of the
correlation-table common to the rows and columns centring
round values of the variables $x_1$, $x_2$, $x_1'$, $x_2'$, we have for the ratio
of the cross-products (frequency of $x_1 x_2$ multiplied by frequency
of $x_1' x_2'$, divided by frequency of $x_1 x_2'$ multiplied by frequency of
$x_1' x_2$)

$$ e^{\frac{r_{12}}{\sigma_{1.2}\,\sigma_{2.1}}\,(x_1' - x_1)(x_2' - x_2)}. $$

Assuming that $x_1' - x_1$ has been taken of the same sign as $x_2' - x_2$,
the exponent is of the same sign as $r_{12}$. Hence the association for
this group of four frequencies is also of the same sign as $r_{12}$, the
ratio of the cross-products being unity, or the association zero,
if $r_{12}$ is zero. In a normal distribution, the association is therefore
of the same sign, the sign of $r_{12}$, for every tetrad of frequencies
in the compartments common to two rows and two columns; that
is to say, the distribution is isotropic. It follows that every
grouping of a normal distribution is isotropic whether the class-intervals
are equal or unequal, large or small, and the sign of the
association for a normal distribution grouped down to 2 × 2-fold
form must always be the same whatever the axes of division
chosen.
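The expression for the ratio of the cross-products may be checked numerically. The Python sketch below is illustrative only, with hypothetical function names; the constant factor of the surface is omitted because it cancels in the ratio, and the stature constants are used merely as convenient values.

```python
import math

def relative_frequency(x1, x2, s1, s2, r):
    """Normal correlation surface with the constant factor omitted."""
    q = 1 - r * r
    phi = (x1**2 / s1**2 + x2**2 / s2**2 - 2 * r * x1 * x2 / (s1 * s2)) / q
    return math.exp(-phi / 2)

def cross_product_ratio(x1, x2, x1p, x2p, s1, s2, r):
    """(x1,x2)(x1',x2') frequencies divided by (x1,x2')(x1',x2) frequencies."""
    f = relative_frequency
    return (f(x1, x2, s1, s2, r) * f(x1p, x2p, s1, s2, r)) / (
            f(x1, x2p, s1, s2, r) * f(x1p, x2, s1, s2, r))

def closed_form(x1, x2, x1p, x2p, s1, s2, r):
    """exp{ r (x1' - x1)(x2' - x2) / (sigma_1.2 * sigma_2.1) }."""
    q = 1 - r * r
    return math.exp(r * (x1p - x1) * (x2p - x2) / (s1 * s2 * q))

s1, s2 = 2.72, 2.75
for r in (0.51, 0.0, -0.51):
    print(r, cross_product_ratio(-1.0, 0.5, 2.0, 3.0, s1, s2, r),
          closed_form(-1.0, 0.5, 2.0, 3.0, s1, s2, r))
```

With both differences taken positive, the ratio exceeds unity for positive r, equals unity for r = 0, and falls below unity for negative r, wherever the four compartments are placed.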

These  theorems  are  of  importance  in  the  applications  of  the 
theory  of  normal  correlation  to  the  treatment  of  qualitative 
characters  which  are  subjected  to  a  manifold  classification.  The 
contingency  tables  for  such  characters  are  sometimes  regarded  as 
groupings  of  a  normal  distribution  of  frequency,  and  the  coefficient 
of  correlation  is  determined  on  this  hypothesis  by  a  rather  lengthy 
procedure  (ref.  14).  Before  applying  this  procedure  it  is  well, 
therefore,  to  see  whether  the  distribution  of  frequency  may  be 
regarded  as  approximately  isotropic,  or  reducible  to  isotropic  form 
by  some  alteration  in  the  order  of  rows  and  columns  (Chap.  V. 
§§ 9-10). If only reducible to isotropic form by some rearrangement,
this rearrangement should be effected before grouping the
table to 2 × 2-fold form for the calculation of the correlation
coefficient by the process referred to. If the table is not reducible
to  isotropic  form  by  any  rearrangement,  the  process  of  calculating 
the  coefficient  of  correlation  on  the  assumption  of  normality  is  to 
be  avoided.  Clearly,  even  if  the  table  be  isotropic  it  need  not  be 
normal,  but  at  least  the  test  for  isotropy  affords  a  rapid  and 
simple  means  for  excluding  certain  distributions  which  are  not 
even  remotely  normal.  Table  II.  of  Chap.  V  might  possibly  be 
regarded  as  a  grouping  of  normally  distributed  frequency  if  re- 
arranged as  suggested  in  §  10  of  the  same  chapter — it  would  be 
worth  the  investigator's  while  to  proceed  further  and  compare 
the  actual  distribution  with  a  fitted  normal  distribution — but 
Table  IV.  could  not  be  regarded  as  normal,  and  could  not  be 
rearranged  so  as  to  give  a  grouping  of  normally  distributed 
frequency. 

13.  If  the  frequencies  in  a  contingency-table  be  not  large,  and 
also  if  the  contingency  or  correlation  be  small,  the  influence 
of  casual  irregularities  due  to  fluctuations  of  sampling  may 
render  it  difficult  to  say  whether  the  distribution  may  be  regarded 
as  essentially  isotropic  or  no.  In  such  cases  some  further  con- 
densation of  the  table  by  grouping  together  adjacent  rows  and 
columns, or some process of "smoothing" by averaging the
frequencies in adjacent compartments, may be of service. The
correlation-table for stature in father and son (Table III., Chap.
IX.), for instance, is obviously not strictly isotropic as it stands:
we have seen, however, that it appears to be normal, within the
limits of fluctuations of sampling, and it should consequently be
isotropic within such limits. We can apply a rough test by
regrouping the table in a much coarser form, say with four rows
and four columns: the table below exhibits such a grouping, the
limits of rows and of columns having been so fixed as to include
not less than 200 observations in each array.


Table I. (condensed from Table III. of Chapter IX.).

                          Father's Stature (inches).
Son's Stature      Under                                  69.5
(inches).          65.5     65.5-67.5    67.5-69.5     and over     Total.

Under 66.5         97.5       74.25        34.75         10.5        217
66.5-68.5          76.5      108           85            52          321.5
68.5-70.5          33.25      64.75        95            84.5        277.5
70.5 and over      14.75      32.5         80.75        134          262

Total             222        279.5        295.5         281         1078

Taking the ratio of the frequency in col. 1 to the sum of the
frequencies in cols. 1 and 2 for each successive row, and so on for
the other pairs of columns, we find the following series of ratios:

Table II. Ratio of Frequency in Column m to Frequency in Column m
+ Frequency in Column (m + 1) in Table I.

                     Columns
Row.      1 and 2.    2 and 3.    3 and 4.

1          0.568       0.681       0.768
2          0.415       0.560       0.620
3          0.339       0.405       0.529
4          0.312       0.287       0.376

These ratios decrease continuously as we pass from the top to the
bottom of the table, and the distribution, as condensed, is therefore
isotropic. The student should form one or two other condensations
of the original table to 3 × 3- or 4 × 4-fold form: he will probably
find them either isotropic, or diverging so slightly from isotropy
that an alteration of the frequencies, well within the margin of
possible fluctuations of sampling, will render the distribution
isotropic.
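The rough test of isotropy applied above may be sketched in Python as follows. The frequencies are those of the condensed 4 × 4 table (rows, son's stature; columns, father's stature); the function and variable names are assumptions of this sketch.

```python
# Condensed table: rows are son's stature classes, columns father's.
table = [
    [97.5, 74.25, 34.75, 10.5],
    [76.5, 108, 85, 52],
    [33.25, 64.75, 95, 84.5],
    [14.75, 32.5, 80.75, 134],
]

def adjacent_ratios(rows):
    """Frequency in column m divided by frequency in columns m and m + 1."""
    return [[row[m] / (row[m] + row[m + 1]) for m in range(len(row) - 1)]
            for row in rows]

ratios = adjacent_ratios(table)

# The table is isotropic if each ratio falls steadily from row to row.
decreasing = all(ratios[i][m] > ratios[i + 1][m]
                 for i in range(len(ratios) - 1)
                 for m in range(len(ratios[0])))

for row in ratios:
    print([round(x, 3) for x in row])
print(decreasing)
```

The same function may be applied to any other condensation of the original table; a single reversal in a column of ratios shows that the grouping is not strictly isotropic.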

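14. Before concluding this chapter we may note briefly some
of the principal properties of the normal distribution of frequency
for any number of variables, referring the student for proofs to
the original memoirs. Denoting the frequency of the combination
of deviations $x_1, x_2, x_3, \ldots, x_n$ by $y_{12 \ldots n}$, we must have
in the notation of Chapter XII., if the uncorrelated deviations $x_1$,
$x_{2.1}$, $x_{3.12}$, etc., be completely independent (cf. § 3 of the present
chapter),

$$ y_{12 \ldots n} = y'_{12 \ldots n}\; e^{-\frac{1}{2}\phi(x_1, x_2, \ldots, x_n)} \tag{12} $$

where

$$ \phi(x_1, x_2, \ldots, x_n) = \frac{x_1^2}{\sigma_1^2} + \frac{x_{2.1}^2}{\sigma_{2.1}^2} + \frac{x_{3.12}^2}{\sigma_{3.12}^2} + \ldots + \frac{x_{n.12 \ldots (n-1)}^2}{\sigma_{n.12 \ldots (n-1)}^2} \tag{13} $$

and

$$ y'_{12 \ldots n} = \frac{N}{(2\pi)^{n/2}\,\sigma_1\,\sigma_{2.1}\,\sigma_{3.12} \ldots \sigma_{n.12 \ldots (n-1)}} \tag{14} $$

The expression (13) for the exponent $\phi$ may be reduced to a
general form corresponding to that given for two variables, viz.

$$ \begin{aligned}
\phi = {} & \frac{x_1^2}{\sigma_{1.23 \ldots n}^2} + \frac{x_2^2}{\sigma_{2.13 \ldots n}^2} + \ldots + \frac{x_n^2}{\sigma_{n.12 \ldots (n-1)}^2} \\
 & - 2\,r_{12.3 \ldots n}\,\frac{x_1 x_2}{\sigma_{1.23 \ldots n}\,\sigma_{2.13 \ldots n}} - \ldots - 2\,r_{(n-1)n.1 \ldots (n-2)}\,\frac{x_{n-1}\,x_n}{\sigma_{(n-1).1 \ldots (n-2)n}\,\sigma_{n.1 \ldots (n-1)}}
\end{aligned} \tag{15} $$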

Several important results may be deduced directly from the form
(13) for the exponent. Clearly this might have been written in
a great variety of ways, commencing with any deviation of the
first order, allotting any primary subscript to the second deviation
(except the subscript of the first), and so on, just as in § 3 we
arrived at precisely the same final form for the exponent whether
we started with the two deviations $x_1$ and $x_{2.1}$ or with $x_2$ and $x_{1.2}$.
Our assumption, then, that the deviations $x_1$, $x_{2.1}$, $x_{3.12}$, etc., are
normally distributed amounts to the assumption that all deviations
of any order and with any suffixes are normally distributed,
i.e. in the general normal distribution for n variables every array
of every order is a normal distribution. It will also follow,
generalising the deduction of § 6, that any linear function of $x_1$, $x_2$,
...., $x_n$ is normally distributed. Further, if in (13) any fixed
values be assigned to $x_{3.12}$ and all the following deviations, the
correlation between $x_1$ and $x_2$, on expanding $x_{2.1}$, is, as we have
seen, normal correlation. Similarly, if any fixed values be
assigned to $x_1$, to $x_{4.123}$, and all the following deviations, on
reducing $x_{3.12}$ to the second order we shall find that the correlation
between $x_{2.1}$ and $x_{3.1}$ is normal correlation, the correlation
coefficient being $r_{23.1}$, and so on. That is to say, using k to
denote any group of secondary suffixes, (1) the correlation between
any two deviations $x_{m.k}$ and $x_{n.k}$ is normal correlation; (2) the
correlation between the said deviations is $r_{mn.k}$ whatever the particular
fixed values assigned to the remaining deviations. The latter
conclusion, it will be seen, renders the meaning of partial
correlation coefficients much more definite in the case of normal
correlation than in the general case. In the general case $r_{mn.k}$
represents merely the average correlation, so to speak, between
$x_{m.k}$ and $x_{n.k}$: in the normal case $r_{mn.k}$ is constant for all the
sub-groups corresponding to particular assigned values of the other
variables. Thus in the case of three variables which are normally
correlated, if we assign any given value to $x_3$, the correlation
between the associated values of $x_1$ and $x_2$ is $r_{12.3}$: in the general
case $r_{12.3}$, if actually worked out for the various sub-groups
corresponding, say, to increasing values of $x_3$, would probably
exhibit some continuous change, increasing or decreasing as the
case might be. Finally, we have to note that if, in the expression
(15) for $\phi$, we assign fixed values, say $h_2$, $h_3$, etc., to all the
deviations except $x_1$, and then throw $\phi$ into the form of a perfect
square (as in § 4 for the case of two variables), we obtain a normal
distribution for $x_1$ in which the mean is displaced by

$$ r_{12.34 \ldots n}\,\frac{\sigma_{1.23 \ldots n}}{\sigma_{2.13 \ldots n}}\,h_2 + r_{13.24 \ldots n}\,\frac{\sigma_{1.23 \ldots n}}{\sigma_{3.12 \ldots n}}\,h_3 + \ldots + r_{1n.23 \ldots (n-1)}\,\frac{\sigma_{1.23 \ldots n}}{\sigma_{n.12 \ldots (n-1)}}\,h_n. $$

But this is a linear function of $h_2$, $h_3$, etc., therefore in the case of
normal correlation the regression of any one variable on any or all
of the others is strictly linear. The expressions
$r_{12.34 \ldots n}\,\sigma_{1.23 \ldots n}/\sigma_{2.13 \ldots n}$, etc., are of course the partial regressions
$b_{12.34 \ldots n}$, etc.
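For three variables the displacement of the array mean can be compared with the ordinary least-squares regression. The Python sketch below is an illustration under stated assumptions: the first-order partial correlation and the expressions used for $\sigma_{1.23}$, $\sigma_{2.13}$ and $\sigma_{3.12}$ are quoted from the general theory of partial correlation rather than derived in this section, and all function names and sample constants are hypothetical.

```python
import math

def partial_r(r_ab, r_ac, r_bc):
    """First-order partial correlation of a and b, c being held constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))

def array_mean_x1(h2, h3, s1, s2, s3, r12, r13, r23):
    """Displacement of the mean of x1 for x2 = h2, x3 = h3, in the form of the text."""
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    s1_23 = s1 * math.sqrt(det / (1 - r23**2))    # sigma_1.23
    s2_13 = s2 * math.sqrt(det / (1 - r13**2))    # sigma_2.13
    s3_12 = s3 * math.sqrt(det / (1 - r12**2))    # sigma_3.12
    b12_3 = partial_r(r12, r13, r23) * s1_23 / s2_13
    b13_2 = partial_r(r13, r12, r23) * s1_23 / s3_12
    return b12_3 * h2 + b13_2 * h3

def least_squares_mean_x1(h2, h3, s1, s2, s3, r12, r13, r23):
    """The same displacement from the normal equations for two predictors."""
    b12_3 = (s1 / s2) * (r12 - r13 * r23) / (1 - r23**2)
    b13_2 = (s1 / s3) * (r13 - r12 * r23) / (1 - r23**2)
    return b12_3 * h2 + b13_2 * h3

# Illustrative constants, assumed for this example only.
args = (2.0, 3.0, 4.0, 0.5, 0.3, 0.4)
print(array_mean_x1(1.5, -2.0, *args), least_squares_mean_x1(1.5, -2.0, *args))
```

Both functions are linear in $h_2$ and $h_3$, which is the property of strictly linear regression stated above.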
EXERCISES.


1.  Deduce  equation  (11)  from  the  equations  for  transformation  of  co-ordinates 
without  assuming  the  normal  distribution.    (A  proof  will  be  found  in  ref.  10.) 

2. Hence show that if the pairs of observed values of $x_1$ and $x_2$ are
represented by points on a plane, and a straight line drawn through the mean, the
sum of the squares of the distances of the points from this line is a minimum
if the line is the major principal axis.

3. The coefficient of correlation with reference to the principal axes being
zero, and with reference to other axes something, there must be some pair of
axes at right angles for which the correlation is a maximum, i.e. is numerically
greatest without regard to sign. Show that these axes make an angle of 45°
with the principal axes, and that the maximum value of the correlation is

$$\frac{\Sigma_1^2 - \Sigma_2^2}{\Sigma_1^2 + \Sigma_2^2},$$

where $\Sigma_1$ and $\Sigma_2$ are the standard-deviations along the two principal axes.


4. (Sheppard, ref. 12.) A fourfold table is formed from a normal correlation
table, taking the points of division between A and α, B and β, at the
medians, so that (A) = (α) = (B) = (β) = N/2. Show that


CHAPTER  XVII. 


THE  SIMPLER  CASES  OF  SAMPLING  FOR  VARIABLES : 
PERCENTILES  AND  MEAN. 

1-2.  The  problem  of  sampling  for  variables ;  the  conditions  assumed— 
3.  Standard  error  of  a  percentile — 4.  Special  values  for  the  percentiles 
of a normal distribution — 5. Effect of the form of the distribution
generally — 6.  Simplified  formula  for  the  case  of  a  grouped  frequency- 
distribution — 7.  Correlation  between  errors  in  two  percentiles  of  the 
same  distribution — 8.  Standard  error  of  the  interquartile  range  for  the 
normal  curve — 9.  Effect  of  removing  the  restrictions  of  simple  sampling, 
and  limitations  of  interpretation  — 10.  Standard  error  of  the  arithmetic 
mean — 11.  Relative  stability  of  mean  and  median  in  sampling — 12. 
Standard  error  of  the  difference  between  two  means — 13.  The  tendency 
to  normality  of  a  distribution  of  means — 14.  Effect  of  removing  the  re- 
strictions of  simple  sampling — 15.  Statement  of  the  standard  errors  of 
standard-deviation,  coefficient  of  variation,  correlation  coefficient  and 
regression,  correlation-ratio  and  criterion  for  linearity  of  regression — 16. 
Restatement  of  the  limitations  of  interpretation  if  the  sample  be  small. 

1. In Chapters XIII.-XVI. we have been concerned solely with
the theory of sampling for the case of attributes and the frequency-
distributions appropriate to that case. We now proceed to
consider some of the simpler theorems for the case of variables
(cf. Chap. XIII. § 2). Suppose that we have a bag containing a
practically infinite number of tickets or cards bearing the recorded
values of some variable X, and that we draw a ticket from this
bag,  note  the  value  that  it  bears,  draw  another,  and  so  on  until 
we  have  drawn  n  cards  (a  number  small  compared  with  the  whole 
number  in  the  bag).  Let  us  continue  this  process  until  we  have 
N  such  samples  of  n  cards  each,  and  then  work  out  the  mean, 
standard-deviation,  median,  etc.,  for  each  of  the  samples.  No  one 
of  these  measures  will  prove  to  be  absolutely  the  same  for  every 
sample,  and  our  problem  is  to  determine  the  standard-deviation 
that  each  such  measure  will  exhibit. 

2.  In  solving  this  problem,  we  must  be  careful  to  define 
precisely  the  conditions  which  are  assumed  to  subsist,  so  as  to 
realise  the  limitations  of  any  solution  obtained.    These  conditions 
were discussed very fully for the case of attributes (Chap. XIII.
§ 8), and we would refer the student to the discussion then given.
Here it is sufficient to state the assumptions briefly, using the
letters (a), (b) and (c) to denote the corresponding assumptions
indicated  by  the  same  letters  in  the  section  cited. 

(a)  We  assume  that  we  are  drawing  from  precisely  the  same 
record  throughout  the  experiment,  so  that  the  chance  of  drawing 
a  card  with  any  given  value  of  X,  or  a  value  within  any  assigned 
limits,  is  the  same  at  each  sampling. 

(b)  We  assume  not  only  that  we  are  drawing  from  the  same 
record  throughout,  but  that  each  of  our  cards  at  each  drawing 
may  be  regarded  quite  strictly  as  drawn  from  the  same  record  (or 
from  identically  similar  records) :  e.g.  if  our  card-record  is  con- 
tained in  a  series  of  bundles,  we  must  not  make  it  a  practice  to 
take  the  first  card  from  bundle  number  1,  the  second  card  from 
bundle  number  2,  and  so  on,  or  else  the  chance  of  drawing  a 
card  with  a  given  value  of  X,  or  a  value  within  assigned  limits, 
may  not  be  the  same  for  each  individual  card  at  each  drawing. 

(c) We assume that the drawing of each card is entirely
independent of that of every other, so that the value of X recorded
on card 1, at each drawing, is uncorrelated with the value of X
recorded  on  card  2,  3,  4,  and  so  on.  It  is  for  this  reason  that  we 
spoke  of  the  record,  in  §  1,  as  containing  a  practically  infinite 
number  of  cards,  for  otherwise  the  successive  drawings  at  each 
sampling  would  not  be  independent :  if  the  bag  contain  ten 
tickets  only,  bearing  the  numbers  1  to  10,  and  we  draw  the  card 
bearing  1,  the  average  of  the  following  cards  drawn  will  be  higher 
than  the  mean  of  all  cards  drawn  ;  if,  on  the  other  hand,  we  draw 
the  10,  the  average  of  the  following  cards  will  be  lower  than  the  mean 
of  all  cards — i.e.  there  will  be  a  negative  correlation  between  the 
number  on  the  card  taken  at  any  one  drawing  and  the  card  taken 
at  any  other  drawing.  Without  making  the  number  of  cards  in 
the  bag  indefinitely  large,  we  can,  as  already  pointed  out  for  the 
case  of  attributes  (Chap.  XIII.  §  3),  eliminate  this  correlation  by 
replacing  each  card  before  drawing  the  next. 
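
The negative correlation described under (c) may be checked by direct enumeration. The following is a minimal sketch in Python (it is not part of the original text), assuming the bag of ten tickets numbered 1 to 10; the function name `correlation` is merely illustrative. It lists every equally likely pair of first and second cards, once without and once with replacement, and computes the correlation between the two numbers in each case.

```python
from itertools import permutations, product
from math import sqrt


def correlation(pairs):
    """Pearson correlation of (x, y) pairs, each pair taken as equally likely."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs) / n
    var_y = sum((y - mean_y) ** 2 for _, y in pairs) / n
    return cov / sqrt(var_x * var_y)


tickets = range(1, 11)  # the ten tickets numbered 1 to 10

# Every equally likely (first card, second card) outcome.
without_replacement = list(permutations(tickets, 2))
with_replacement = list(product(tickets, repeat=2))

r_without = correlation(without_replacement)
r_with = correlation(with_replacement)
print(round(r_without, 4), round(r_with, 4))
```

Without replacement the correlation works out at -1/9; with replacement it vanishes, which is the sense in which replacing each card before drawing the next restores the independence of the drawings.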

Sampling  conducted  under  these  conditions  we  shall,  as  before, 
speak  of  as  simple  sampling.  We  do  not,  it  should  be  noticed, 
make  the  further  assumption  that  the  sample  is  unbiassed,  i.e. 
that  the  chance  of  inclusion  in  the  sample  is  independent  of  the 
value  of  X  recorded  on  the  card  (cf.  the  last  paragraph  in  §  8, 
Chap.  XIII.,  and  the  discussion  in  §§  4-8,  Chap.  XIV.).  This 
assumption  is  unnecessary.  If  it  be  true,  the  interpretation  of 
our  results  becomes  simpler  and  more  straightforward,  for  we 
can substitute for such phrases as "the standard-deviation of X
in a very large sample," "the form of the frequency-distribution
in a very large sample," the phrases "the standard-deviation of
X in the original record," "the form of the frequency-distribution
in the original record": but in very many, perhaps the majority
of,  practical  cases  the  very  question  at  issue  is  the  nature  of  the 
relation  between  the  distribution  of  the  sample  and  the  distribu- 
tion of  the  record  from  which  it  is  drawn.  As  has  already  been 
emphasised  in  the  passages  to  which  reference  is  made  above,  no 
examination  of  samples  drawn  under  the  same  conditions  can 
give  any  evidence  on  this  head. 

3.  Standard  Error  of  a  Percentile. — Let  us  consider  first  the 
fluctuations  of  sampling  for  a  given  percentile,  as  the  problem  is 
intimately related to that of Chaps. XIII.-XIV.

Let $X_p$ be a value of X such that pN of the values of X in
an indefinitely large sample drawn under the same conditions lie
above it and qN below it.

If we note the proportions of observations above $X_p$ in samples
of n drawn from the record, we know that these observed values
will tend to centre round p as mean, with a standard-deviation
$\sqrt{pq/n}$. If now at each drawing, as well as observing the
proportion of X's above $X_p$, say $p + \delta$, for the sample, we also proceed
to note the adjustment $\epsilon$ required in $X_p$ to make the proportion
of observations above $X_p + \epsilon$ in the sample p, the standard-
deviation of $\epsilon$ will bear to the standard-deviation of $\delta$ the same
ratio that $\epsilon$ on an average bears to $\delta$. But this ratio is quite
simply determinable if the number of observations in the sample
is sufficiently large to justify us in assuming that $\delta$ is small, so
small that we may regard the element of the frequency curve
(for a very large sample) over which $X_p + \epsilon$ ranges as approximately
a rectangle. If this assumption be made, and we denote the
standard-deviation of X in a very large sample by $\sigma$, and the
ordinate of the frequency curve at $X_p$ when drawn with unit area
and unit standard-deviation by $y_p$,

$$\epsilon = \frac{\sigma}{y_p}\,\delta.$$

Therefore for the standard-deviation of $\epsilon$, or of the percentile
corresponding to a proportion p, we have

$$\sigma_{X_p} = \frac{\sigma}{y_p}\sqrt{\frac{pq}{n}} \qquad (1)$$

4. If the frequency-distribution for the very large sample be a
normal curve, the values of $y_p$ for the principal percentiles may be
taken from the published tables. A table calculated by Mr
Sheppard (Table III., p. 9, in Tables for Statisticians and
Biometricians, or Table IV., ref. 16, in Appendix I.) gives the values
directly, and these have been utilised for the following: the
student can estimate the values roughly by a combined use of the
area and ordinate tables for the normal curve given in Chapter
XV., remembering to divide the ordinates given in that table by
$\sqrt{2\pi}$ so as to make the area unity.


                        Value of $y_p$

Median                  0.3989423
Deciles 4 and 6         0.3863425
Deciles 3 and 7         0.3476926
Deciles 2 and 8         0.2799619
Deciles 1 and 9         0.1754983
Quartiles               0.3177766


Inserting these values of $y_p$ in equation (1), we have the
following values for the standard errors of the median, deciles,
etc., and the values given in the second column for their probable
errors (Chap. XV. § 17), which the student may sometimes find
useful:


                        Standard error is          Probable error is
                        $\sigma/\sqrt{n}$ multiplied by     $\sigma/\sqrt{n}$ multiplied by

Median                  1.25331                    0.84535
Deciles 4 and 6         1.26804                    0.85528
Deciles 3 and 7         1.31800                    0.88897
Deciles 2 and 8         1.42877                    0.96369
Deciles 1 and 9         1.70942                    1.15298
Quartiles               1.36263                    0.91908


It  will  be  seen  that  the  influence  of  fluctuations  of  sampling  on 
the  several  percentiles  increases  as  we  depart  from  the  median  : 
the  standard  error  of  the  quartiles  is  nearly  one-tenth  greater  than 
that  of  the  median,  and  the  standard  error  of  the  first  or  ninth 
deciles  more  than  one-third  greater. 
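
The multipliers tabulated in § 4 may be recomputed as a check. The sketch below is an illustration only and not part of the original text: it assumes equation (1) in the form standard error = $(\sigma/y_p)\sqrt{pq/n}$, uses the `NormalDist` class of the Python standard library in place of Sheppard's tables, and takes the probable error as the standard error multiplied by the upper-quartile deviate of the unit normal curve (about 0.6745). The names `se_multiplier` and `PROBABLE_ERROR_RATIO` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

UNIT_NORMAL = NormalDist()  # normal curve with unit area and unit standard-deviation


def se_multiplier(p):
    """Factor multiplying sigma/sqrt(n) in equation (1); p is the proportion above the percentile."""
    q = 1.0 - p
    x_p = UNIT_NORMAL.inv_cdf(q)   # abscissa with a proportion q below it
    y_p = UNIT_NORMAL.pdf(x_p)     # ordinate of the unit normal curve at that point
    return sqrt(p * q) / y_p


# Probable error = standard error times the quartile deviate of the normal curve.
PROBABLE_ERROR_RATIO = UNIT_NORMAL.inv_cdf(0.75)

percentiles = [
    ("Median", 0.5),
    ("Deciles 4 and 6", 0.4),
    ("Deciles 3 and 7", 0.3),
    ("Deciles 2 and 8", 0.2),
    ("Deciles 1 and 9", 0.1),
    ("Quartiles", 0.25),
]

for label, p in percentiles:
    standard = se_multiplier(p)
    probable = PROBABLE_ERROR_RATIO * standard
    print(f"{label:<16} {standard:.5f} {probable:.5f}")
```

Since p and q enter only through their product, each pair of deciles equidistant from the median shares one multiplier, as in the table.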

5. Consider further the influence of the form of the frequency-
distribution on the standard error of the median, as this is an
important form of average. For a distribution with a given
number of observations and a given standard-deviation the
standard error varies inversely as $y_p$. Hence for a distribution in
which $y_p$ is small, for example a U-shaped distribution like that
of fig. 18 or fig. 19, the standard error of the median will be
relatively high, and it will, in so far, be an undesirable form of
average to employ. On the other hand, in the case of a distribution
which has a high peak in the centre, so as to exhibit a value
of $y_p$ large compared with the standard-deviation, the standard
error of the median will be relatively low. We can create such a
"peaked" distribution by superposing a normal curve with a
small standard-deviation on a normal curve with the same mean
and a relatively large standard-deviation. To give some idea of
the reduction in the standard error of the median that may be
effected by a moderate change in the form of the distribution, let
us find for what ratio of the standard-deviations of two such curves,
having the same area, the standard error of the median reduces to
$\sigma/\sqrt{n}$, where $\sigma$ is of course the standard-deviation of the
compound distribution.

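Let $\sigma_1$, $\sigma_2$ be the standard-deviations of the two distributions,
and let there be n/2 observations in each. Then

$$\sigma^2 = \frac{\sigma_1^2 + \sigma_2^2}{2} \qquad (a)$$

On the other hand, the value of $y_p$ is

$$y_p = \frac{\sigma}{2\sqrt{2\pi}}\left(\frac{1}{\sigma_1} + \frac{1}{\sigma_2}\right) = \frac{\sigma(\sigma_1 + \sigma_2)}{2\sqrt{2\pi}\,\sigma_1\sigma_2} \qquad (b)$$

Hence the standard error of the median is

$$\frac{\sigma}{2 y_p \sqrt{n}} = \frac{\sqrt{2\pi}\,\sigma_1\sigma_2}{(\sigma_1 + \sigma_2)\sqrt{n}} \qquad (c)$$

(c) is equal to $\sigma/\sqrt{n}$ if

$$\frac{(\sigma_1 + \sigma_2)\sqrt{\sigma_1^2 + \sigma_2^2}}{2\sqrt{\pi}\,\sigma_1\sigma_2} = 1.$$

Writing $\sigma_2/\sigma_1 = \rho$, that is if

$$\frac{(1 + \rho)\sqrt{1 + \rho^2}}{2\sqrt{\pi}\,\rho} = 1$$

or

$$\rho^4 + 2\rho^3 + (2 - 4\pi)\rho^2 + 2\rho + 1 = 0.$$

This equation may be reduced to a quadratic and solved by
taking $\rho + 1/\rho$ as a new variable. The roots found give $\rho$ = 2.2360
.... or 0.4472 ...., the one root being merely the reciprocal of
the other. The standard error of the median will therefore be
$\sigma/\sqrt{n}$, in such a compound distribution, if the standard-deviation
of the one normal curve is, in round numbers, about 2¼ times
that of the other. If the ratio be greater, the standard error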
of the median will be less than $\sigma/\sqrt{n}$. The distribution
for which the standard error of the median is exactly equal to
$\sigma/\sqrt{n}$ is shown in fig. 53: it will be seen that it is by no means
a very striking form of distribution; at a hasty glance it might
almost be taken as normal. In the case of distributions of a form
more or less similar to that shown, it is evident that we cannot
at all safely estimate by eye alone the relative standard error of
the median as compared with $\sigma/\sqrt{n}$.
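
The reduction of the quartic of § 5 to a quadratic can be followed numerically. The sketch below is an illustration only, assuming the quartic to be $\rho^4 + 2\rho^3 + (2 - 4\pi)\rho^2 + 2\rho + 1 = 0$ with $\rho$ the ratio of the two standard-deviations: dividing by $\rho^2$ and writing $u = \rho + 1/\rho$ gives $u^2 + 2u - 4\pi = 0$. The function `median_se_ratio` is a hypothetical helper giving the standard error of the median as a multiple of $\sigma/\sqrt{n}$ for two superposed normal curves of equal area.

```python
from math import pi, sqrt

# Quadratic in u = rho + 1/rho obtained from the quartic: u**2 + 2*u - 4*pi = 0.
u = -1.0 + sqrt(1.0 + 4.0 * pi)   # the positive root is the one required

# Recover rho from rho**2 - u*rho + 1 = 0; the two roots are reciprocals.
rho_large = (u + sqrt(u * u - 4.0)) / 2.0
rho_small = (u - sqrt(u * u - 4.0)) / 2.0


def median_se_ratio(rho):
    """Standard error of the median divided by sigma/sqrt(n) for the compound curve."""
    return 2.0 * sqrt(pi) * rho / ((1.0 + rho) * sqrt(1.0 + rho * rho))


print(round(rho_large, 4), round(rho_small, 4))
print(round(median_se_ratio(rho_large), 6), round(median_se_ratio(3.0), 4))
```

For any ratio larger than the greater root, `median_se_ratio` falls below unity, in agreement with the remark that the median then has the smaller standard error.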

Fig. 53.

6. In the case of a grouped frequency-distribution, if the
number of observations is sufficient to make the class-frequencies
run fairly smoothly, i.e. to enable us to regard the distribution
as nearly that of a very large sample, the standard error of any
percentile can be calculated very readily indeed, for we can
eliminate $\sigma$ from equation (1). Let $f_p$ be the frequency-per-
class-interval at the given percentile (simple interpolation will
give us the value with quite sufficient accuracy for practical
purposes, and if the figures run irregularly they may be smoothed).
Let $\sigma$ be the value of the standard-deviation expressed in class-
intervals, and let n be the number of observations as before.
Then since $y_p$ is the ordinate of the frequency-distribution when
drawn with unit standard-deviation and unit area, we must
have

$$y_p = \frac{\sigma}{n}\,f_p.$$

But this gives at once for the standard error expressed in terms
of the class-interval as unit

$$\sigma_{X_p} = \frac{\sqrt{npq}}{f_p} \qquad (2)$$

As an example in which we can compare the results given by
the two different formulae (1) and (2), take the distribution of
stature used as an illustration in Chaps. VII. and VIII. and in
§§ 13, 14 of Chap. XV. The number of observations is 8585,
and the standard-deviation 2.57 in., the distribution being
approximately normal: $\sigma/\sqrt{n}$ = 0.027737, and, multiplying by the
factor 1.253 .... given in the table in § 4, this gives 0.0348
as the standard error of the median, on the assumption of
normality of the distribution. Using the direct method of
equation (2), we find the median to be 67.47 (Chap. VII. § 15),
which is very nearly at the centre of the interval with a
frequency 1329. Taking this as being, with sufficient accuracy
for our present purpose, the frequency per interval at the median,
the standard error is

$$\frac{\sqrt{8585 \times \tfrac{1}{2} \times \tfrac{1}{2}}}{1329} = 0.0349.$$

As  we  should  expect,  the  value  is  practically  the  same  as  that 
obtained  from  the  value  of  the  standard-deviation  on  the  assump- 
tion of  normality. 

Let us find the standard error of the first and ninth deciles
as another illustration. On the assumption that the distribution
is normal, these standard errors are the same, and equal to
0.027737 × 1.70942 = 0.0474. Using the direct method, we
find by simple interpolation the approximate frequencies per
interval at the first and ninth deciles respectively to be 590 and
570, giving standard errors of 0.0471 and 0.0488, mean 0.0479,
slightly in excess of that found on the assumption that the
frequency is given by the normal curve. The student should notice
that the class-interval is, in this case, identical with the unit of
measurement, and consequently the answer given by equation (2)
does not require to be multiplied by the magnitude of the
interval.

In the case of the distribution of pauperism (Chap. VII.,
Example i.), the fact that the class-interval is not a unit must
be remembered. The frequency at the median (3.195 per cent.)
is approximately 96, and this gives for the standard error of the
median by (2) (the number of observations being 632) 0.1309
intervals, that is 0.0655 per cent.
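
The arithmetic of the examples in § 6 may be retraced as follows. This is a sketch only, assuming equation (2) in the form $\sqrt{npq}/f_p$ class-intervals; the function `percentile_se_grouped` and its `interval` argument are hypothetical names, and the interval of 0.5 per cent. for the pauperism distribution is inferred from the two figures quoted in the text (0.1309 intervals, 0.0655 per cent.).

```python
from math import sqrt


def percentile_se_grouped(n, p, f_p, interval=1.0):
    """Equation (2): sqrt(n*p*q)/f_p class-intervals, converted to the unit of measurement."""
    q = 1.0 - p
    return sqrt(n * p * q) / f_p * interval


# Stature: 8585 observations, class-interval equal to the unit of measurement.
stature_median = percentile_se_grouped(8585, 0.5, 1329)
stature_decile_1 = percentile_se_grouped(8585, 0.9, 590)   # nine-tenths lie above the first decile
stature_decile_9 = percentile_se_grouped(8585, 0.1, 570)

# Pauperism: 632 observations, class-interval assumed to be 0.5 per cent.
pauperism_median_intervals = percentile_se_grouped(632, 0.5, 96)
pauperism_median_per_cent = percentile_se_grouped(632, 0.5, 96, interval=0.5)

print(round(stature_median, 4), round(stature_decile_1, 4), round(stature_decile_9, 4))
print(round(pauperism_median_intervals, 4), round(pauperism_median_per_cent, 4))
```

Only n, the proportion p and the frequency per interval at the percentile are required; the standard-deviation does not enter.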

7. In finding the standard error of the difference between two
percentiles in the same distribution, the student must be careful
to note that the errors in two such percentiles are not
independent. Consider the two percentiles, for which the values
of p and q are $p_1$, $q_1$ and $p_2$, $q_2$ respectively, the first-named being the
lower of the two percentiles. These two percentiles divide the
whole area of the frequency curve into three parts, the areas of
which are proportional to $q_1$, $1 - q_1 - p_2$, and $p_2$. Further, since
the errors in the first percentile are directly proportional to the
errors in $q_1$, and the errors in the second percentile are directly
proportional but of opposite sign to the errors in $p_2$, the correlation
between errors in the two percentiles will be the same as
the correlation between errors in $q_1$ and $p_2$ but of opposite sign.
But if there be a deficiency of observations below the lower
percentile, producing an error $\delta_1$ in $q_1$, the missing observations
will tend to be spread over the two other sections of the curve
in proportion to their respective areas, and will therefore tend to
produce an error

$$\delta_2 = -\frac{p_2}{p_1}\,\delta_1$$

in $p_2$. If then r be the correlation between errors in $q_1$ and $p_2$, and
$\epsilon_1$ and $\epsilon_2$ their respective standard errors, we have

$$r\,\frac{\epsilon_2}{\epsilon_1} = -\frac{p_2}{p_1}.$$

Or, inserting the values of the standard errors,

$$r = -\sqrt{\frac{q_1 p_2}{p_1 q_2}}.$$

The correlation between the percentiles is the same in magnitude
but opposite in sign: it is obviously positive, and consequently

$$\text{correlation between errors in two percentiles} = \sqrt{\frac{q_1 p_2}{p_1 q_2}} \qquad (3)$$

If the two percentiles approach very close together, $q_1$ and $q_2$,
$p_1$ and $p_2$ become sensibly equal to one another, and the correlation
becomes unity, as we should expect.

8. Let us apply the above value of the correlation between
percentiles to find the standard error of the semi-interquartile
range for the normal curve. Inserting $q_1 = p_2 = \tfrac{1}{4}$, $q_2 = p_1 = \tfrac{3}{4}$, we
find $r = \tfrac{1}{3}$. Hence the standard error of the interquartile range
is, applying the ordinary formula for the standard-deviation of a
difference, $2/\sqrt{3}$ times the standard error of either quartile, or
the standard error of the semi-interquartile range $1/\sqrt{3}$ times
the standard error of a quartile. Taking the value of the
standard error of a quartile from the table in § 4, we have, finally,

$$\text{standard error of the semi-interquartile range in a normal distribution} = 0.78672\,\frac{\sigma}{\sqrt{n}} \qquad (4)$$

Of  course  the  standard-deviation  of  the  inter-quartile,  or  semi- 
interquartile,  range  can  readily  be  worked  out  in  any  particular 
case,  using  equation  (2)  and  the  value  of  the  correlation 
given  above :  it  is  best  to  work  out  such  standard  errors 
from  first  principles,  applying  the  usual  formula  for  the  standard 
deviation  of  the  difference  of  two  correlated  variables  (Chap.  XI. 
§  2,  equation  (1)). 
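
The working from first principles recommended above may be sketched as follows for the quartiles of a normal curve. The sketch is illustrative only: it assumes the correlation of § 7 in the form $\sqrt{q_1 p_2/(p_1 q_2)}$ and the usual formula for the standard-deviation of the difference of two correlated variables, $\sqrt{\epsilon_1^2 + \epsilon_2^2 - 2r\epsilon_1\epsilon_2}$; the function names are hypothetical, and the standard library's `NormalDist` stands in for the printed tables.

```python
from math import sqrt
from statistics import NormalDist


def percentile_error_correlation(q1, p2):
    """Correlation between errors in two percentiles.

    q1 is the proportion below the lower percentile, p2 the proportion above the upper.
    """
    p1, q2 = 1.0 - q1, 1.0 - p2
    return sqrt((q1 * p2) / (p1 * q2))


def se_of_difference(se_a, se_b, r):
    """Standard-deviation of the difference of two variables with correlation r."""
    return sqrt(se_a ** 2 + se_b ** 2 - 2.0 * r * se_a * se_b)


r_quartiles = percentile_error_correlation(0.25, 0.25)

# Standard error of a quartile of a normal curve, in units of sigma/sqrt(n).
unit_normal = NormalDist()
quartile_se = sqrt(0.25 * 0.75) / unit_normal.pdf(unit_normal.inv_cdf(0.75))

interquartile_se = se_of_difference(quartile_se, quartile_se, r_quartiles)
semi_interquartile_se = interquartile_se / 2.0

print(round(r_quartiles, 4), round(semi_interquartile_se, 5))
```

For a distribution that is not normal the same two functions would serve, the standard errors of the two quartiles being taken from equation (2) instead.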

9.  If  there  is  any  failure  of  the  conditions  of  simple  sampling, 
the  formulae  of  the  preceding  sections  cease,  of  course,  to  hold 
good.  We  need  not,  however,  enter  again  into  a  discussion  of 
the  effect  of  removing  the  several  restrictions,  for  the  effect  on 
the  standard  error  of  p  was  considered  in  detail  in  §§  9-14  of 
Chap.  XIV.,  and  the  standard  error  of  any  percentile  is  directly 
proportional to the standard error of p (cf. § 3). Further, the
student may be reminded that the standard error of any percentile
measures solely the fluctuations that may be expected in
that  percentile  owing  to  the  errors  of  simple  sampling  alone :  it 
has  no  bearing,  therefore,  save  on  the  one  question,  whether  an 
observed  divergence  of  the  percentile,  from  a  certain  value  that 
might  be  expected  to  be  yielded  by  a  more  extended  series  of 
observations  or  that  had  actually  been  observed  in  some  other 
series,  might  or  might  not  be  due  to  fluctuations  of  simple 
sampling  alone.  It  cannot  and  does  not  give  any  indication  of 
the  possibility  of  the  sample  being  biassed  or  unrepresentative  of 
the  material  from  which  it  has  been  drawn,  nor  can  it  give  any 
indication  of  the  magnitude  or  influence  of  definite  errors  of 
observation — errors  which  may  conceivably  be  of  greater  im- 
portance than  errors  of  sampling.  In  the  case  of  the  distribution 
of  statures,  for  instance,  the  standard  error  almost  certainly  gives 
quite  a  misleading  idea  as  to  the  accuracy  attained  in  determining 
the  average  stature  for  the  United  Kingdom  :  the  sample  is  not 
representative,  the  several  parts  of  the  kingdom  not  contributing 
in  their  true  proportions.  The  student  should  refer  again  to  the 
discussion  of  these  points  in  §§  4-8  of  Chap.  XIV.  Finally,  we 
may  note  that  the  standard  error  of  a  percentile  cannot  be 
evaluated  unless  the  number  of  observations  is  fairly  large — large 
enough to determine $f_p$ (eqn. 2) with reasonable accuracy, or
to test whether we may treat the distribution as approximately
normal (cf. also § 16 below).

(As regards the theory of sampling for the median and percentiles
generally, cf. ref. 15, Laplace, Supplement II. (standard
error of the median), Edgeworth, refs. 5, 6, 7, and Sheppard, ref.
27: the preceding sections have been based on the work of
Edgeworth and Sheppard.)

10. Standard Error of the Arithmetic Mean. Let us now pass
to a fresh problem, and determine the standard error of the
arithmetic mean.

This is very readily obtained. Suppose we note separately at
each drawing the value recorded on the first, second, third ....
and nth card of our sample. The standard-deviation of the values
on each separate card will tend in the long run to be the same,
and identical with the standard-deviation $\sigma$ of X in an indefinitely
large sample, drawn under the same conditions. Further, the
value recorded on each card is (as we assume) uncorrelated with
that on every other. The standard-deviation of the sum of the
values recorded on the n cards is therefore $\sqrt{n}\,\sigma$, and the
standard-deviation of the mean of the sample is consequently
1/nth of this; or,

$$\sigma_m = \frac{\sigma}{\sqrt{n}} \qquad (5)$$

This is a most important and frequently cited formula, and the
student should note that it has been obtained without any
reference to the size of the sample or to the form of the frequency-
distribution. It is therefore of perfectly general application, if
$\sigma$ be known. We can verify it against our formula for the
standard-deviation of sampling in the case of attributes. The
standard-deviation of the number of successes in a sample of m
observations is $\sqrt{mpq}$: the standard-deviation of the total
number of successes in n samples of m observations each is therefore
$\sqrt{nmpq}$: dividing by n we have the standard-deviation of
the mean number of successes in the n samples, viz. $\sqrt{mpq}/\sqrt{n}$,
agreeing with equation (5).
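
That the result of § 10 holds whatever the form of the distribution and whatever the size of the sample may be illustrated by exhaustive enumeration. In the sketch below (an illustration, not part of the original text) the five values in `record` are hypothetical and deliberately skew, and the result is taken to be standard error of the mean = $\sigma/\sqrt{n}$; every possible sample of n cards drawn with replacement is listed, so that the standard-deviation of the sample means is found exactly and not by trial.

```python
from itertools import product
from math import sqrt
from statistics import pstdev

record = [1, 2, 2, 3, 7]   # hypothetical record; any set of values would serve
n = 3                      # number of cards in each sample

# All equally likely samples of n cards drawn with replacement (simple sampling).
sample_means = [sum(sample) / n for sample in product(record, repeat=n)]

sd_of_means = pstdev(sample_means)        # standard-deviation of the means
predicted = pstdev(record) / sqrt(n)      # sigma / sqrt(n)

print(round(sd_of_means, 6), round(predicted, 6))
```

The two figures agree to within rounding for any choice of `record` and `n`, though the number of samples to be enumerated grows rapidly with n.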

11.  For  a  normal  curve  the  standard  error  of  the  mean  is  to 
the  standard  error  of  the  median  approximately  as  100  to  125 
(cf.  §  4),  and  in  general  the  standard  errors  of  the  two  stand  in 
a  somewhat  similar  ratio  for  a  distribution  not  differing  largely 
from the normal form. For the distribution of statures used as
an illustration in § 6 the standard error of the median was found
to be 0.0349: the standard error of the mean is only 0.0277.
The distribution being very approximately normal, the ratio of
the two standard errors, viz. 1.26, assumes almost exactly the
theoretical magnitude. In the case of the asymmetrical distribution of
rates of pauperism, also used as an illustration in § 6, the standard
error of the median was found to be 0.0655 per cent. The
standard error of the mean is only 0.0493 per cent., which bears
to the standard error of the median a ratio of 1 to 1.33. As
such  cases  as  these  seem  on  the  whole  to  be  the  more  common 
and  typical,  we  stated  in  Chap.  VII.  §  18  that  the  mean  is  in 
general  less  affected  than  the  median  by  errors  of  sampling.  At 
the  same  time  we  also  indicated  the  exceptional  cases  in  which 
the  median  might  be  the  more  stable — cases  in  which  the  mean 
might, for example, be affected considerably by small groups of
widely outlying observations, or in which the frequency-distribution
assumed a form resembling fig. 53, but even more
exaggerated as regards the height of the central "peak" and the
relative length of the "tails." Such distributions are not uncommon
in some economic statistics, and they might be expected
to  characterise  some  forms  of  experimental  error.  If,  in  these 
cases,  the  greater  stability  of  the  median  is  sufficiently  marked 
to  outweigh  its  disadvantages  in  other  respects,  the  median 
may  be  the  better  form  of  average  to  use.  Fig.  53  represents 
a  distribution  in  which  the  standard  errors  of  the  mean  and  of  the 
median  are  the  same.  Further,  in  some  experimental  cases  it  is 
conceivable  that  the  median  may  be  less  affected  by  definite 
experimental  errors,  the  average  of  which  does  not  tend  to  be 
zero, than is the mean; this is, of course, a point quite distinct
from that of errors of sampling.

12. If two quite independent samples of $n_1$ and $n_2$ observations
respectively be drawn from a record, evidently $\epsilon_{12}$, the standard
error of the difference of their means, is given by

$$\epsilon_{12}^2 = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right) \qquad (6)$$

If an observed difference exceed three times the value of $\epsilon_{12}$
given by this formula it can hardly be ascribed to fluctuations
of sampling. If, in a practical case, the value of $\sigma$ is not known
a priori, we must substitute an observed value, and it would seem
natural to take as this value the standard-deviation in the two
samples thrown together. If, however, the standard-deviations
of the two samples themselves differ more than can be accounted
for on the basis of fluctuations of sampling alone (see below, § 15),
we evidently cannot assume that both samples have been drawn
from the same record: the one sample must have been drawn
from a record or a universe exhibiting a greater standard-deviation
than the other. If two samples be drawn quite independently
from different universes, indefinitely large samples from which
exhibit the standard-deviations $\sigma_1$ and $\sigma_2$, the standard error of
the difference of their means will be given by

$$\epsilon_{12}^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \qquad (7)$$

This  is,  indeed,  the  formula  usually  employed  for  testing  the 
significance  of  the  difference  between  two  means  in  any  case  : 
seeing  that  the  standard  error  of  the  mean  depends  on  the 
standard-deviation  only,  and  not  on  the  mean,  of  the  distribution, 
we  can  inquire  whether  the  two  universes  from  which  samples 
have  been  drawn  differ  in  mean  apart  from  any  difference  in 
dispersion. 

If two quite independent samples be drawn from the same
universe, but instead of comparing the mean of the one with the
mean of the other we compare the mean of the first with the
mean $m_0$ of both samples together, the use of (6) or (7) is not
justified, for errors in the mean of the one sample are correlated
with errors in the mean of the two together. Following precisely
the lines of the similar problem in § 13, Chap. XIII., case III., we
find that this correlation is $\sqrt{n_1/(n_1 + n_2)}$, and hence

$$\epsilon_{01}^2 = \sigma^2\,\frac{n_2}{n_1(n_1 + n_2)} \qquad (8)$$


(For  a  complete  treatment  of  this  problem  in  the  case  of  samples 
drawn  from  two  different  universes  cf.  ref.  22.) 
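
The three cases of § 12 may be set side by side in a short sketch. It is illustrative only: the formulae are taken in the forms $\sigma^2(1/n_1 + 1/n_2)$ and $\sigma_1^2/n_1 + \sigma_2^2/n_2$ for the squared standard errors in the first two cases, and the third case is built up from the stated correlation $\sqrt{n_1/(n_1+n_2)}$ by the usual formula for the difference of two correlated variables. The function names and the numerical values (a standard-deviation of 2.57 and samples of 400 and 900) are hypothetical.

```python
from math import sqrt


def se_difference_same_record(sigma, n1, n2):
    """Two independent samples from one record: sigma * sqrt(1/n1 + 1/n2)."""
    return sigma * sqrt(1.0 / n1 + 1.0 / n2)


def se_difference_two_universes(sigma1, n1, sigma2, n2):
    """Two independent samples from different universes."""
    return sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)


def se_sample_vs_pooled(sigma, n1, n2):
    """Mean of the first sample compared with the mean of both samples together."""
    se_first = sigma / sqrt(n1)
    se_pooled = sigma / sqrt(n1 + n2)
    r = sqrt(n1 / (n1 + n2))   # correlation between the two means
    return sqrt(se_first ** 2 + se_pooled ** 2 - 2.0 * r * se_first * se_pooled)


sigma, n1, n2 = 2.57, 400, 900   # hypothetical values for illustration
independent = se_difference_same_record(sigma, n1, n2)
pooled = se_sample_vs_pooled(sigma, n1, n2)
closed_form = sigma * sqrt(n2 / (n1 * (n1 + n2)))

print(round(independent, 4), round(pooled, 4), round(closed_form, 4))
```

The comparison with the pooled mean always yields the smaller standard error, since the first sample forms part of the total with which it is compared.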

13.  The  distribution  of  means  of  samples  drawn  under  the 
conditions  of  simple  sampling  will  always  be  more  symmetrical 
than  the  distribution  of  the  original  record,  and  the  symmetry 
will  be  the  greater  the  greater  the  number  of  observations  in  the 
sample.  Further,  the  distribution  of  means  (and  therefore  also  of 
the  differences  between  means)  tends  to  become  not  merely  sym- 
metrical but  normal.  We  can  only  illustrate,  not  prove,  the 
point here; but if the student will refer to § 13, Chap. XV., he will
see  that  the  genesis  of  the  normal  curve  in  this  case  is  in  accord- 
ance with  what  we  then  stated,  viz.  that  the  distribution  tends  to 
be  normal  whenever  the  variable  may  be  regarded  as  the  sum 
(or  some  slightly  more  complex  function)  of  a  number  of  other 
variables. In the present instance this condition is strictly
fulfilled. The mean of the sample of n observations is the sum of
the values in the sample each divided by n, and we should expect
the distribution to be the more nearly normal the larger n. As
an illustration of the approach to symmetry even for small values
of n, we may take the following case. If the student will turn to
the  calculated  binomials,  given  as  illustrations  of  the  forms  of 
binomial  distributions  in  Chap.  XV.  §  3,  he  will  find  there  the 
distribution  of  the  number  of  successes  for  twenty  events  when 
q = 0.9, p = 0.1 : the distribution is extremely skew, starting at
zero,  rising  to  high  frequencies  for  1  and  2  successes,  and  thence 
tailing  off  to  20  cases  of  7  successes  in  10,000  throws,  4  cases  of  8 
successes and 1 case of 9 successes. But now find the
distribution for the mean number of successes in groups of five throws,
under the same conditions. This will be equivalent to finding
the distribution of the number of successes for 100 such events,
and then dividing the observed number of successes by five, the
last process making no difference to the form of the distribution,
but only to its scale. But the distribution of the number of
successes for 100 events when q = 0.9, p = 0.1, is also given in
Chap.  XV.  §  3,  and  it  will  be  seen  that,  while  it  is  appreciably 
asymmetrical,  the  divergence  from  symmetry  is  comparatively 
small :  the  distribution  has  gained  very  greatly  in  symmetry 
though  only  five  observations  have  been  taken  to  the  sample. 
We  may  therefore  reasonably  assume,  if  our  sample  is  large, 
that the distribution of means is approximately a normal
distribution, and we may calculate, on that assumption, the
frequency with which any given deviation from a theoretical value
or a value observed in some other series, in an observed mean, will
arise  from  fluctuations  of  simple  sampling  alone. 
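
The gain in symmetry described above can be put in figures. The following Python sketch computes the two binomial distributions (20 and 100 events, p = 0.1) and a moment measure of skewness for each. The measure used, the third moment about the mean divided by the cube of the standard-deviation, is chosen here for illustration and is an assumption of this sketch; the function names are likewise hypothetical.

```python
from math import comb


def binomial_pmf(n, p):
    """Probabilities of 0, 1, ..., n successes in n independent events."""
    q = 1.0 - p
    return [comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)]


def moment_skewness(pmf):
    """Third moment about the mean divided by the cube of the standard-deviation."""
    mean = sum(k * f for k, f in enumerate(pmf))
    second = sum((k - mean) ** 2 * f for k, f in enumerate(pmf))
    third = sum((k - mean) ** 3 * f for k, f in enumerate(pmf))
    return third / second ** 1.5


skew_20 = moment_skewness(binomial_pmf(20, 0.1))    # single throws of 20 events
skew_100 = moment_skewness(binomial_pmf(100, 0.1))  # means of groups of five throws
print(round(skew_20, 3), round(skew_100, 3))
```

Dividing the number of successes by five alters the scale only, so the skewness of the mean of five throws is that of the 100-event binomial. On this measure the asymmetry falls to rather less than half its original value.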

The  warning  is  necessary,  however,  that  the  approach  to 
normality  is  only  rapid  if  the  condition  that  the  several  drawings 
for  each  sample  shall  be  independent  is  strictly  fulfilled.  If  the 
observations  are  not  independent,  but  are  to  some  extent  positively 
correlated with each other, even a fairly large sample may
continue to reflect any asymmetry existing in the original distribution
(cf. ref. 32 and the record of sampling there cited).

If  the  original  distribution  be  normal,  the  distribution  of 
means,  even  of  small  samples,  is  strictly  normal.  This  follows  at 
once  from  the  fact  that  any  linear  function  of  normally  distributed 
variables is itself normally distributed (Chap. XVI. § 6). The
distribution will not in general, however, be normal if the
deviation of the mean of each sample is expressed in terms of the
standard-deviation of that sample (cf. ref. 30).

14.  Let  us  consider  briefly  the  effect  on  the  standard  error  of 
the  mean  if  the  conditions  of  simple  sampling  as  laid  down  in 
§  2  cease  to  apply. 

(a)  If  we  do  not  draw  from  the  same  record  all  the  time,  but 
first  draw  a  series  of  samples  from  one  record,  then  another 
series  from  another  record  with  a  somewhat  different  mean  and 
standard-deviation, and so on, or if we draw the successive
samples from essentially different parts of the same record, the
standard error will be greatly increased. For suppose we draw
$k_1$ samples from the first record, for which the standard-deviation
(in an indefinitely large sample) is $\sigma_1$, and the mean differs by $d_1$
from the mean of all the records together (as ascertained by
large samples in numbers proportionate to those now taken) ; $k_2$
samples from the second record, for which the standard-deviation
is $\sigma_2$, and the mean differs by $d_2$ from the mean of all the records
together, and so on. Then for the samples drawn from the first
record the standard error of the mean will be $\sigma_1/\sqrt{n}$, but the
distribution will centre round a value differing by $d_1$ from the
mean for all the records together : and so on for the samples
drawn from the other records. Hence, if $\sigma_m$ be the standard error
of the mean, N the total number of samples,

$$N \sigma_m^2 = \sum \left( \frac{k \sigma^2}{n} \right) + \sum (k d^2).$$

But the standard-deviation $\sigma_0$ for all the records together is given
by

$$N \sigma_0^2 = \sum (k \sigma^2) + \sum (k d^2).$$

Hence, writing $\sum (k d^2) = N s_m^2$,

$$\sigma_m^2 = \frac{\sigma_0^2}{n} + \frac{n - 1}{n} s_m^2 \qquad (9)$$

This  equation  corresponds  precisely  to  equation  (2)  of  §  9,  Chap. 
XIV. The standard error of the mean, if our samples are drawn
from different records or from essentially different parts of the
entire record, may be increased indefinitely as compared with the
value it would have in the case of simple sampling. If, for
example, we take the statures of samples of n men in a number
of different districts of England, and the standard-deviation of all
the statures observed is $\sigma_0$, the standard-deviation of the means
for the different districts will not be $\sigma_0/\sqrt{n}$, but will have some
greater  value,  dependent  on  the  real  variation  in  mean  stature 
from  district  to  district. 
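
The algebra of case (a) can be verified numerically. The sketch below, in Python, computes the standard error of the mean once directly from the sum over the separate records and once from equation (9), and compares both with the value for simple sampling. The function names and the figures (three records, samples of 25) are hypothetical, chosen only so that the deviations d balance about the general mean.

```python
import math


def se_mean_direct(k, sigma, d, n):
    """From N * sigma_m^2 = sum(k * sigma^2 / n) + sum(k * d^2)."""
    total = sum(k)
    variance = sum(ki * (si ** 2 / n + di ** 2)
                   for ki, si, di in zip(k, sigma, d)) / total
    return math.sqrt(variance)


def se_mean_equation_9(k, sigma, d, n):
    """From equation (9); also returns the simple-sampling value sigma_0 / sqrt(n)."""
    total = sum(k)
    var_0 = sum(ki * (si ** 2 + di ** 2) for ki, si, di in zip(k, sigma, d)) / total
    s_m_sq = sum(ki * di ** 2 for ki, di in zip(k, d)) / total
    return math.sqrt(var_0 / n + (n - 1) / n * s_m_sq), math.sqrt(var_0 / n)


# Hypothetical records: numbers of samples, standard-deviations, and
# deviations of the record means from the general mean (sum of k*d is zero).
k = [50, 30, 20]
sigma = [2.5, 2.6, 2.4]
d = [-0.5, 0.5, 0.5]
n = 25

direct = se_mean_direct(k, sigma, d, n)
by_eq_9, simple = se_mean_equation_9(k, sigma, d, n)
print(round(direct, 4), round(by_eq_9, 4), round(simple, 4))
```

The two computed values agree, and both exceed the simple-sampling value, the excess depending on the differences between the record means.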

(b) If we are drawing from the same record throughout, but
always draw the first card from one part of that record, the
second card from another part, and so on, and these parts differ
more or less, the standard error of the mean will be decreased.
For if, in large samples drawn from the subsidiary parts of the
record from which the several cards are taken, the
standard-deviations are $\sigma_1$, $\sigma_2$, .... $\sigma_n$, and the means differ by $d_1$, $d_2$,
.... $d_n$ from the mean for a large sample from the entire record,
we have

$$\sigma_0^2 = \frac{1}{n} \sum (\sigma^2) + \frac{1}{n} \sum (d^2).$$

Hence, writing $\sum (d^2) = n s_m^2$,

$$\sigma_m^2 = \frac{1}{n^2} \sum (\sigma^2) = \frac{\sigma_0^2}{n} - \frac{s_m^2}{n} \qquad (10)$$

The  last  equation  again  corresponds  precisely  with  that  given  for 
the  same  departure  from  the  rules  of  simple  sampling  in  the  case 
of  attributes  (Chap.  XIV.  §  11.,  eqn.  4).  If,  to  vary  our  previous 
illustration,  we  had  measured  the  statures  of  men  in  each  of  n 
different  districts,  and  then  proceeded  to  form  a  set  of  samples 
by  taking  one  man  from  each  district  for  the  first  sample,  one 
man  from  each  district  for  the  second  sample,  and  so  on,  the 
standard-deviation  of  the  means  of  the  samples  so  formed  would 
be  appreciably  less  than  the  standard  error  of  simple  sampling 
$\sigma_0/\sqrt{n}$. As a limiting case, it is evident that if the men in each
district were all of precisely the same stature, the means of all the
samples so compounded would be identical : in such a case, in fact,
$\sigma_0 = s_m$, and consequently $\sigma_m = 0$. To give another illustration, if
the  cards  from  which  we  were  drawing  samples  had  been  arranged 
in  order  of  the  magnitude  of  X  recorded  on  each,  we  would  get 
a  much  more  stable  sample  by  drawing  one  card  from  each 
successive  nth  part  of  the  record  than  by  taking  the  sample 
according  to  our  previous  rules — e.g.  shaking  them  up  in  a  bag 
and  taking  out  cards  blindfold,  or  using  some  equivalent  process. 

The  result  is  perhaps  of  some  practical  interest.  It  shows  that, 
if  we  are  actually  taking  samples  from  a  large  area,  different 
districts  of  which  exhibit  markedly  different  means  for  the 
variable  under  consideration,  and  are  limited  to  a  sample  of  n 
observations ;  if  we  break  up  the  whole  area  into  n  sub-districts, 
each  as  homogeneous  as  possible,  and  take  a  contribution  to  the 
sample  from  each,  we  will  obtain  a  more  stable  mean  by  this 
orderly  procedure  than  will  be  given,  for  the  same  number  of 
observations,  by  any  process  of  selecting  the  districts  from  which 
samples  shall  be  taken  by  chance.  There  may,  however,  be  a 
greater  risk  of  biassed  error.  The  conclusions  seem  in  accord 
with  common-sense. 
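
The saving effected by drawing one observation from each part of the record can be checked by complete enumeration on a small artificial record. The Python sketch below assumes a record of nine cards arranged in order and divided into three equal parts; the record, the variable names, and the use of the standard library `statistics` module are all choices made for this illustration.

```python
from itertools import product
from statistics import mean, pvariance

# Hypothetical record of nine cards in order of magnitude, in three equal parts.
parts = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
n = len(parts)
record = [x for part in parts for x in part]

var_0 = pvariance(record)                            # square of sigma_0
s_m_sq = pvariance([mean(part) for part in parts])   # mean square of the d's

var_simple = var_0 / n                  # simple sampling of n cards
var_eq_10 = var_0 / n - s_m_sq / n      # as in the equation for case (b)

# Every possible sample formed by taking one card from each part.
var_enumerated = pvariance([mean(sample) for sample in product(*parts)])

print(round(var_simple, 4), round(var_eq_10, 4), round(var_enumerated, 4))
```

The enumerated value agrees with that given by the equation for case (b), and is about one-tenth of the value for simple sampling, since nearly all the variation in this record lies between the parts.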

(c) Finally, suppose that, while our conditions (a) and (b) of § 2
hold  good,  the  magnitude  of  the  variable  recorded  on  one  card 
drawn  is  no  longer  independent  of  the  magnitude  recorded  on 
another card, e.g. that if the first card drawn at any sampling
bears a high value, the next and following cards of the same
sample are likely to bear high values also. Under these
circumstances, if $r_{12}$ denote the correlation between the values on the
first and second cards, and so on,

$$\sigma_m^2 = \frac{\sigma^2}{n} + \frac{2 \sigma^2}{n^2} (r_{12} + r_{13} + \ldots + r_{23} + \ldots).$$

There are n(n-1)/2 correlations ; and if, therefore, r is the
arithmetic mean of them all, we may write

$$\sigma_m^2 = \frac{\sigma^2}{n} [1 + r(n - 1)] \qquad (11)$$

As the means and standard-deviations of $x_1$, $x_2$, .... are all
identical,  r  may  more  simply  be  regarded  as  the  correlation 
coefficient  for  a  table  formed  by  taking  all  possible  pairs  of  the 
n  values  in  every  sample.  If  this  correlation  be  positive,  the 
standard  error  of  the  mean  will  be  increased,  and  for  a  given 
value  of  r  the  increase  will  be  the  greater,  the  greater  the  size  of 
the  samples.  If  r  be  negative,  on  the  other  hand,  the  standard 
error  will  be  diminished.  Equation  (11)  corresponds  precisely  to 
equation  (6),  §  13,  of  Chap.  XIV. 
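
To show the magnitude of the effect, the following Python sketch evaluates equation (11) and checks it against a direct summation of the variances and covariances of the n values in the sample. It assumes, for simplicity, that every pair of cards has the same correlation r; the function names and the figures are hypothetical.

```python
import math


def se_mean_correlated(sigma, n, r):
    """Standard error of the mean from equation (11), r being the mean correlation."""
    return sigma / math.sqrt(n) * math.sqrt(1.0 + r * (n - 1))


def se_mean_from_covariances(sigma, n, r):
    """The same quantity by summing all n*n variances and covariances directly."""
    total = sum(sigma ** 2 * (1.0 if i == j else r)
                for i in range(n) for j in range(n))
    return math.sqrt(total) / n


independent = se_mean_correlated(sigma=10.0, n=25, r=0.0)
correlated = se_mean_correlated(sigma=10.0, n=25, r=0.1)
check = se_mean_from_covariances(sigma=10.0, n=25, r=0.1)
print(round(independent, 4), round(correlated, 4), round(check, 4))
```

With samples of 25, even so small a mean correlation as 0.1 multiplies the square of the standard error by 3.4.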

As  was  pointed  out  in  that  chapter,  the  case  when  r  is  positive 
covers  the  case  discussed  under  (a) :  for  if  we  draw  successive 
samples  from  different  records,  such  a  positive  correlation  is  at 
once  introduced,  although  the  drawings  of  the  several  cards  at 
each  sampling  are  quite  independent  of  one  another.  Similarly, 
the  case  discussed  under  (b)  is  covered  by  the  case  of  negative 
correlation,  for  if  each  card  is  always  drawn  from  a  separate  and 
distinct part of the record, the correlation between any two x's will
on  the  average  be  negative  :  if  some  one  card  be  always  drawn 
from  a  part  of  the  record  containing  low  values  of  the  variable, 
the  others  must  on  an  average  be  drawn  from  parts  containing 
relatively  high  values.  It  is  as  well,  however,  to  keep  the  cases 
(a), (b), and (c) distinct, since a positive or negative correlation
may arise for reasons quite different from those considered under
(a) and (b).

15.  With  this  discussion  of  the  standard  error  of  the  arithmetic 
mean  we  must  bring  the  present  work  to  a  close.  To  indicate 
briefly  our  reasons  for  not  proceeding  further  with  the  discussion 
of  standard  errors,  we  must  remind  the  student  that  in  order  to 
express  the  standard  error  of  the  mean  we  require  to  know,  in 
addition  to  the  mean  itself,  the  standard-deviation  about  the  mean, 
or, in other words, the mean (deviation)$^2$ with respect to the mean.
Similarly, to express the standard error of the standard-deviation
we require to know, in the general case, the mean (deviation)$^4$
with respect to the mean. Either, then, we must find this quantity
for  the  given  distribution — and  this  would  entail  entering  on  a 
field  of  work  which  hitherto  we  have  intentionally  avoided — or  we 
must,  if  that  be  possible,  assume  the  distribution  to  be  of  such  a 
form that we can express the mean (deviation)$^4$ in terms of the
mean (deviation)$^2$. This can be done, as a fact, for the normal
distribution,  but  the  proof  would  again  take  us  rather  beyond 
the  limits  that  we  have  set  ourselves.  To  deal  with  the  standard 
error  of  the  correlation  coefficient  would  take  us  still  further 
afield,  and  the  proof  would  be  laborious  and  difficult,  if  not 
impossible, without the use of the differential and integral
calculus. We must content ourselves, therefore, with a simple
statement  of  the  standard  errors  of  some  of  the  more  important 
constants. 

Standard-deviation. — If the distribution be normal,

$$\text{standard error of the standard-deviation in a normal distribution} = \frac{\sigma}{\sqrt{2n}} \qquad (12)$$


This  is  generally  given  as  the  standard  error  in  all  cases  :  it  is, 
however,  by  no  means  exact :  the  general  expression  is 

$$\text{standard error of the standard-deviation in a distribution of any form} = \sqrt{\frac{\mu_4 - \mu_2^2}{4 n \mu_2}} \qquad (13)$$

where $\mu_4$ is the mean (deviation)$^4$, deviations being, of course,
measured from the mean, and $\mu_2$ the mean (deviation)$^2$ or the
square of the standard-deviation : n is assumed sufficiently large
to  make  the  errors  in  the  standard-deviation  small  compared  with 
that  quantity  itself.  Equation  (13)  may  in  some  cases  give 
values considerably greater (twice as great or more) than (12).
(Cf. ref. 17.) If, however, the distribution be normal, equation
(12) gives the standard error not merely of standard-deviations of
order zero, to use the terminology of Chap. XII., but of
standard-deviations of any order (ref. 33). It will be noticed, on reference
to  equation  (4)  above,  §  8,  that  the  standard  error  of  the  standard- 
deviation  is  less  than  that  of  the  semi-interquartile  range  for  a 
normal  distribution. 
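
The difference between equations (12) and (13) is readily illustrated. The Python sketch below computes both standard errors, taking the moments in (13) from a small, very skew, artificial distribution. The data, the sample size, and the function names are assumptions made for the illustration.

```python
import math


def se_sd_normal(sigma, n):
    """Equation (12): standard error of the standard-deviation, normal case."""
    return sigma / math.sqrt(2 * n)


def se_sd_general(mu_2, mu_4, n):
    """Equation (13): standard error of the standard-deviation, any form."""
    return math.sqrt((mu_4 - mu_2 ** 2) / (4 * n * mu_2))


def central_moment(data, order):
    """Mean of the given power of the deviations from the arithmetic mean."""
    centre = sum(data) / len(data)
    return sum((x - centre) ** order for x in data) / len(data)


# Hypothetical, very skew record: ninety values of 0 and ten values of 10.
data = [0.0] * 90 + [10.0] * 10
mu_2 = central_moment(data, 2)
mu_4 = central_moment(data, 4)

general = se_sd_general(mu_2, mu_4, n=100)
normal = se_sd_normal(math.sqrt(mu_2), n=100)
print(round(general, 4), round(normal, 4))
```

For this record the general formula gives a value nearly double that of the normal formula. When $\mu_4 = 3\mu_2^2$, as in the normal distribution, the two expressions coincide.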

For  a  normal  distribution,  again,  we  have — 


$$\text{standard error of the coefficient of variation } v = \frac{v}{\sqrt{2n}} \left[ 1 + 2 \left( \frac{v}{100} \right)^2 \right]^{1/2} \qquad (14)$$

The expression in the bracket is usually very nearly unity, for
a normal distribution, and in that case may be neglected.
Correlation  coefficient. — If  the  distribution  be  normal, 

$$\text{standard error of the correlation coefficient for a normal distribution} = \frac{1 - r^2}{\sqrt{n}} \qquad (15)$$

This  is  the  value  always  given  :  the  use  of  a  more  general  formula 
which  would  entail  the  use  of  higher  moments  does  not  appear 
to  have  been  attempted.  As  regards  the  case  of  small  samples, 
cf.  refs.  10,  28,  and  31.  Equation  (15)  gives  the  standard  error 
of  a  coefficient  of  any  order,  total  or  partial  (ref.  33).  For  the 
standard  error  of  the  correlation-coefficient  for  a  fourfold  table 
(Chap. XI., § 10), see ref. 34 : the formula (15) does not apply.
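
A minimal sketch of equation (15) in Python follows. The function name and the example values are hypothetical, and the formula is used on the assumptions stated above, namely a normal distribution and a large sample.

```python
import math


def se_correlation(r, n):
    """Equation (15): standard error of r for a normal distribution."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    return (1.0 - r * r) / math.sqrt(n)


# Hypothetical case: r = 0.5 from 400 observations.
se_r = se_correlation(0.5, 400)
print(se_r)
```

It will be seen from the form of the expression that, for a given number of observations, the standard error diminishes as r approaches unity in either direction.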
Coefficient  of  regression. — If  the  distribution  be  normal, 

$$\text{standard error of the coefficient of regression } b_{12} \text{ for a normal distribution} = \frac{\sigma_1 \sqrt{1 - r_{12}^2}}{\sigma_2 \sqrt{n}} = \frac{\sigma_{1.2}}{\sigma_2 \sqrt{n}} \qquad (16)$$

This  formula  again  applies  to  a  regression  coefficient  of  any  order, 
total  or  partial :  i.e.  in  terms  of  our  general  notation,  k  denoting 
any  collection  of  secondary  subscripts  other  than  1  or  2, 

$$\text{standard error of } b_{12.k} \text{ for a normal distribution} = \frac{\sigma_{1.2k}}{\sigma_{2.k} \sqrt{n}}$$

Correlation  ratio. — The  general  expression  for  the  standard 
error  of  the  correlation-ratio  is  a  somewhat  complex  expression 
(cf.  Professor  Pearson's  original  memoir  on  the  correlation-ratio, 
ref.  18,  Chap.  X.).  In  general,  however,  it  may  be  taken  as 
given  sufficiently  closely  by  the  above  expression  for  the  standard 
error  of  the  correlation  coefficient,  that  is  to  say, 

$$\text{standard error of correlation-ratio } \eta \text{ approximately} = \frac{1 - \eta^2}{\sqrt{n}} \qquad (17)$$

As was pointed out in Chap. X., § 21, the value of $\zeta = \eta^2 - r^2$ is a
test for linearity of regression. Very approximately (Blakeman,
ref. 1),

$$\text{standard error of } \zeta = 2 \sqrt{\frac{\zeta}{n}} \sqrt{(1 - \eta^2)^2 - (1 - r^2)^2 + 1} \qquad (18)$$

For rough work the value of the second square root may be
taken as nearly unity, and we have then the simple expression,

$$\text{standard error of } \zeta \text{ roughly} = 2 \sqrt{\frac{\zeta}{n}} \qquad (19)$$
To convert any standard error to the probable error multiply by
the constant 0.674489 ....
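
The test for linearity of regression may be sketched as follows in Python, using Blakeman's approximate expression, its rough form, and the conversion to the probable error. The function names and the values of the correlation-ratio, the correlation coefficient and the number of observations are hypothetical.

```python
import math

PROBABLE_ERROR_FACTOR = 0.674489  # constant given in the text


def se_zeta(eta, r, n):
    """Blakeman's approximate standard error of zeta = eta^2 - r^2."""
    zeta = eta ** 2 - r ** 2
    if zeta < 0:
        raise ValueError("eta must not be numerically less than r")
    second_root = math.sqrt((1 - eta ** 2) ** 2 - (1 - r ** 2) ** 2 + 1)
    return 2 * math.sqrt(zeta / n) * second_root


def se_zeta_rough(eta, r, n):
    """Rough form, taking the second square root as unity."""
    return 2 * math.sqrt((eta ** 2 - r ** 2) / n)


eta, r, n = 0.6, 0.5, 1000          # hypothetical values
zeta = eta ** 2 - r ** 2
exact = se_zeta(eta, r, n)
rough = se_zeta_rough(eta, r, n)
probable_error = PROBABLE_ERROR_FACTOR * exact
print(round(zeta, 4), round(exact, 4), round(rough, 4), round(probable_error, 4))
```

In this hypothetical case the value of $\zeta$ is between five and six times its standard error, so that the regression could hardly be regarded as linear.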

16. We need hardly restate once more the warnings given in
Chap.  XIV.,  and  repeated  in  §  9  above,  that  a  standard  error  can 
give  no  evidence  as  to  the  biassed  or  representative  character  of 
a  sample,  nor  as  to  the  magnitude  of  errors  of  observation,  but 
we  may,  in  conclusion,  again  emphasise  the  warnings  given 
in  §§  1-3,  Chap.  XIV.,  as  to  the  use  of  standard  errors  when 
the  number  of  observations  in  the  sample  is  small. 

In  the  first  place,  if  the  sample  be  small,  we  cannot  in  general 
assume  that  the  distribution  of  errors  is  approximately  normal : 
it  would  only  be  normal  in  the  case  of  the  median  (for  which 
p  and  q  are  equal)  and  in  the  case  of  the  mean  of  a  normal 
distribution.  Consequently,  if  n  be  small,  the  rule  that  a 
range  of  three  times  the  standard  error  includes  the  majority 
of  the  fluctuations  of  simple  sampling  of  either  sign  does  not 
strictly  apply,  and  the  "probable  error"  becomes  of  doubtful 
significance. 

Secondly, it will be noted that the values of $\sigma$ and $y_p$ in (1), of
$f_p$ in (2), and of $\sigma$ in (4) and (5), i.e. the values that would be
given  for  these  constants  by  an  indefinitely  large  sample  drawn 
under  the  same  conditions,  or  the  values  that  they  possess  in 
the  original  record  if  the  sample  is  unbiassed,  are  assumed  to  be 
known  a  priori.  But  this  is  only  the  case  in  dealing  with  the 
problems  of  artificial  chance :  in  practical  cases  we  have  to  use 
the  values  given  us  by  the  sample  itself.  If  this  sample  is  based 
on  a  considerable  number  of  observations,  the  procedure  is  safe 
enough, but if it be only a small sample we may possibly
misestimate the standard error to a serious extent. Following the
procedure suggested in Chap. XIV., some rough idea as to the
possible extent of under-estimation or over-estimation may be
obtained, e.g. in the case of the mean, by first working out the
standard error of $\sigma$ on the assumption that the values for the
necessary moments are correct, and then replacing $\sigma$ in the
expression for the standard error of the mean by $\sigma$ ± three times
its  standard  error  so  obtained. 
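
The procedure just described may be sketched in Python as follows. For simplicity the standard error of the standard-deviation is here taken from the normal formula, $\sigma/\sqrt{2n}$, which is an assumption of this sketch; for a distribution of other form the general expression would be used instead. The function name and the figures are hypothetical.

```python
import math


def se_mean_bounds(sigma, n, multiple=3.0):
    """Lower, central and upper values of the standard error of the mean,
    obtained by replacing sigma by sigma -/+ multiple * (its standard error)."""
    se_sigma = sigma / math.sqrt(2 * n)       # normal-distribution assumption
    low_sigma = max(sigma - multiple * se_sigma, 0.0)
    high_sigma = sigma + multiple * se_sigma
    root_n = math.sqrt(n)
    return low_sigma / root_n, sigma / root_n, high_sigma / root_n


# Hypothetical small sample: 18 observations, standard-deviation 6.
low, central, high = se_mean_bounds(sigma=6.0, n=18)
print(round(low, 4), round(central, 4), round(high, 4))
```

With so few observations the standard error of the mean might be anything from one-half to one and a half times the value given by the sample itself.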

Finally,  it  will  be  remembered  that  unless  the  number  of 
observations  is  large,  we  cannot  interpret  the  standard  error  of 
any  constant  in  the  inverse  sense,  i.e.  the  standard  error  ceases 
to  measure  with  reasonable  accuracy  the  standard-deviation  of 
true  values  of  the  constant  round  the  observed  value  (Chap. 
XIV.  §  3).  If  the  sample  be  large,  the  direct  and  inverse 
standard  errors  are  approximately  the  same. 
EXERCISES.

1. For the data in the last column of Table IX., Chap. VI. p. 95, find
the standard error of the median (154.7 lbs.).

2. For the same distribution, find the standard errors of the two quartiles
(142.5 lbs., 168.4 lbs.).

3. For the same distribution, find the standard error of the
semi-interquartile range.

4. The standard-deviation of the same distribution is 21.3 lbs. Find the
standard error of the mean, and compare its magnitude with that of the
standard error of the median (Qn. 1).

5. Work out the standard error of the standard-deviation for the
distribution of statures used as an illustration in § 6. (Standard-deviation 2.57 in. ;
8585 observations.) Compare the ratio of standard error of standard-
deviation to the standard-deviation, with the ratio of the standard error of
the semi-interquartile range to the semi-interquartile range, assuming the
distribution normal.

6. Calculate a small table giving the standard errors of the correlation
coefficient, based on (1) 100, (2) 1000 observations, for values of r = 0, 0.2, 0.4,
0.6, 0.8, assuming the distribution normal.
TABLES FOR FACILITATING STATISTICAL WORK.
A.  CALCULATING  TABLES. 

For  heavy  arithmetical  work  an  arithmometer  is,  of  course, 
invaluable  ;  but,  owing  to  their  cost,  arithmetic  machines  are,  as  a 
rule,  beyond  the  reach  of  the  student.  For  a  great  deal  of  simple 
work,  especially  work  not  intended  for  publication,  the  student 
will  find  a  slide-rule  exceedingly  useful :  particulars  and  prices 
will  be  found  in  any  instrument  maker's  catalogue.  A  plain 
25-cm.  rule  will  serve  for  most  ordinary  purposes,  or  if  greater 
accuracy  is  desired,  a  50-cm.  rule,  a  Fuller  spiral  rule,  or  one  of 
Hannyngton-pattern  rules  (Aston  &  Mander,  London),  in  which 
the  scale  is  broken  up  into  a  number  of  parallel  segments,  may  be 
preferred.  For  greater  exactness  in  multiplying  or  dividing, 
logarithms are almost essential : five-figure tables suffice if answers
are  only  desired  true  to  five  digits ;  if  greater  accuracy  is  needed, 
seven-figure  tables  must  be  used.  It  is  hardly  necessary  to  cite 
special  editions  of  tables  of  logarithms  here,  but  attention  may 
perhaps  be  directed  to  the  recently  issued  eight-figure  tables  of 
Bauschinger  and  Peters  (W.  Engelmann,  Leipzig,  and  Asher  &  Co., 
London,  1910;  vol.  i.  containing  logarithms  of  all  numbers  from 
1  to  200,000,  price  18s.  6d.  net. ;  vol.  ii.  containing  logs,  of 
trigonometric  functions). 

If  it  is  desired  to  avoid  logarithms,  extended  multiplication 
tables  are  very  useful.  There  are  many  of  these,  and  four  of 
different forms are cited below. Zimmermann's tables are
inexpensive and recommended for the elementary student, Cotsworth's,
Crelle's,  or  Peters'  tables  for  more  advanced  work.  Barlow's  tables 
are  invaluable  for  calculating  standard-deviations  of  ungrouped 
observations  and  similar  work. 

(1) Barlow's Tables of Squares, Cubes, Square-roots, Cube-roots, and
Reciprocals of all Integer Numbers up to 10,000 ; E. & F. N. Spon,
London and New York ; new edition, 1930, price 7s. 6d.

(2) Cotsworth, M. B., The Direct Calculator, Series 0. (Product table to
1000 x 1000.) M'Corquodale & Co., London ; price with thumb index,
25s.; without index, 21s.

(3) Crelle, A. L., Rechentafeln. (Multiplication table giving all products up
to 1000 x 1000.) Can be obtained with explanatory introduction in
German or in English. G. Reimer, Berlin ; price 15s.

(4) Elderton, W. P., "Tables of Powers of Natural Numbers, and of the
Sums of Powers of the Natural Numbers from 1 to 100" (gives
powers up to seventh), Biometrika, vol. ii. p. 474.

(5) Peters, J., Neue Rechentafeln für Multiplikation und Division. (Gives
products up to 100 x 10,000 : more convenient than Crelle for forming
four-figure products. Introduction in English, French or German.)
G. Reimer, Berlin ; price 15s.

(6) Zimmermann, H., Rechentafel, nebst Sammlung häufig gebrauchter
Zahlenwerthe. (Products of all numbers up to 100 x 1000 : subsidiary
tables of squares, cubes, square-roots, cube-roots and reciprocals, etc.
for all numbers up to 1000 at the foot of the page.) W. Ernst & Son,
Berlin ; price 5s. ; English edition, Asher & Co., London, 6s.

B.  SPECIAL  TABLES  OF  FUNCTIONS,  ETC. 

Several tables of service will be found in the works cited in
Appendix II., e.g., a table of Gamma Functions in Elderton's
book (12) and a table of six-figure logarithms of the factorials
of all numbers from 1 to 1100 in De Morgan's treatise (11). The
majority of the tables in the list below, which were originally
published in Biometrika, together with others, are contained in
Tables for Statisticians and Biometricians, Part I., edited by Karl
Pearson (Biometric Laboratory, University College, London) ;
price 15s. net.

(7) Davenport, C. B., Statistical Methods, with especial reference to
Biological Variation ; New York, John Wiley ; London, Chapman &
Hall ; second edition, 1904. (Tables of area and ordinates of the
normal curve, gamma functions, probable errors of the coefficient of
correlation, powers, logarithms, etc.)

(8) Duffell, J. H., "Tables of the Gamma-function," Biometrika, vol. vii.,
1909, p. 43. (Seven-figure logarithms of the function, proceeding by
differences of 0.001 of the argument.)

(9) Elderton, W. P., "Tables for Testing the Goodness of Fit of Theory to
Observation," Biometrika, vol. i., 1902, p. 155.

(10) Everitt, P. F., "Tables of the Tetrachoric Functions for Fourfold
Correlation Tables," Biometrika, vol. vii., 1910, p. 437, and vol.
viii., 1912, p. 385. (Tables for facilitating the calculation of the
correlation coefficient of a fourfold table by Pearson's method on the
assumption that it is a grouping of a normally distributed table ; cf.
ref. 14 of Chap. XVI.)

(11) Gibson, Winifred, "Tables for Facilitating the Computation of
Probable Errors," Biometrika, vol. iv., 1906, p. 385.

(12) Heron, D., "An Abac to determine the Probable Errors of Correlation
Coefficients," Biometrika, vol. vii., 1910, p. 411. (A diagram giving
the probable error for any number of observations up to 1000.)

(13) Lee, Alice, "Tables of F(r, v) and H(r, v) Functions," British
Association Report, 1899. (Functions occurring in connection with Professor
Pearson's frequency curves.)

(14) Lee, Alice, "Tables of the Gaussian 'Tail-functions,' when the 'tail'
is larger than the body," Biometrika, vol. x., 1914, p. 208.

(15) Rhind, A., "Tables for Facilitating the Computation of Probable Errors
of the Chief Constants of Skew Frequency-distributions," Biometrika,
vol. vii., 1909-10, p. 127 and p. 386.

(16) Sheppard, W. F., "New Tables of the Probability Integral," Biometrika,
vol. ii., 1903, p. 174. (Includes not merely table of areas of the normal
curve (to seven figures), but also a table of the ordinates to the same
degree of accuracy.)

(17) Sheppard, W. F., "Table of Deviates of the Normal Curve" (with
introductory article on Grades and Deviates by Sir Francis Galton),
Biometrika, vol. v., 1907, p. 404. (A table giving the deviation of
the normal curve, in terms of the standard-deviation as unit, for the
ordinates which divide the area into a thousand equal parts.)

A  number  of  useful  tables  will  be  found  in  the  series  "Tracts 
for  Computers,"  published  by  the  Cambridge  University  Press  for 
the  Department  of  Applied  Statistics,  University  College,  London. 
A  list  is  usually  given  in  the  advertisement  pages  of  the  current 
issue  of  Biometrika. 
SHORT LIST OF WORKS ON THE MATHEMATICAL
THEORY  OF  STATISTICS  AND  THE  THEORY  OF 
PROBABILITY. 

The student may find the following short list of service, as
supplementing the lists of references given at the ends of the
several chapters, the latter containing, as a rule, original memoirs
only. The economic student who wishes to know more of the
practical side of statistics may be referred to Mr A. L. Bowley's
"Elements" (6 below), to An Elementary Manual of Statistics
(3rd ed., Macdonald & Evans, London, 1925), by the same writer
(useful as a general guide to English statistics), and to M. Jacques
Bertillon's Cours élémentaire de statistique (Société d'Éditions
scientifiques, 1895 : international in scope). Dr A. Newsholme's
Vital  Statistics  (Swan  Sonnenschein,  3rd  edn.,  1899)  will  also  be 
of  service  to  students  of  that  subject. 

The  great  majority  of  the  works  mentioned  in  the  following 
list,  with  others  which  it  has  not  been  thought  necessary  to 
include,  are  in  the  library  of  the  Royal  Statistical  Society. 

(1) Airy, Sir G. B., On the Algebraical and Numerical Theory of Errors of
Observations; 1st edn., 1861; 3rd edn., 1879.

(2) Bernoulli, J., Ars conjectandi, opus posthumum: Accedit tractatus de
seriebus infinitis, et epistola gallice scripta de ludo pilae reticularis,
1713. (A German translation in Ostwald's Klassiker der exakten
Wissenschaften, Nos. 107, 108.)

(3) Bertrand, J. L. F., Calcul des probabilités; Gauthier-Villars, Paris, 1889.

(4) Betz, W., Ueber Korrelation; Beihefte zur Zeitschrift für ang. Psych.
und psych. Sammelforschung; J. A. Barth, Leipzig, 1911. (Applications
to psychology.)

(5) Borel, E., Éléments de la théorie des probabilités; Hermann, Paris, 1909.

(6) Bowley, A. L., Elements of Statistics; P. S. King, London; 1st edn.,
1901; 3rd edn., 1907.

(7) Brown, W., The Essentials of Mental Measurement; Cambridge University
Press, 1911. (Part 2 on the theory of correlation: applications
to experimental psychology.)

(8) Bruns, H., Wahrscheinlichkeitsrechnung und Kollektivmasslehre;
Teubner, Leipzig, 1906.

(9) Cournot, A. A., Exposition de la théorie des chances et des probabilités,
1843.

(10) Czuber, E., Wahrscheinlichkeitsrechnung und ihre Anwendung auf
Fehlerausgleichung, Statistik und Lebensversicherung; Teubner,
Leipzig, 2nd edn., vol. i., 1908-10.

(11) De Morgan, A., Treatise on the Theory of Probabilities (extracted from
the Encyclopædia Metropolitana), 1837.

(12) Elderton, W. P., Frequency Curves and Correlation; C. & E. Layton,
London, 1906. (Deals with Professor Pearson's frequency curves and
correlation, with illustrations chiefly of actuarial interest.)

(13) Fechner, G. T., Kollektivmasslehre (posthumously published; edited
by G. F. Lipps); Engelmann, Leipzig, 1897.

(14) Galloway, T., Treatise on Probability (republished from the 7th edn.
of the Encyclopædia Britannica), 1839.

(15) Gauss, C. F., Méthode des moindres carrés: Mémoires sur la combinaison
des observations, traduits par J. Bertrand, 1855.

(16) Johannsen, W., Elemente der exakten Erblichkeitslehre; Fischer, Jena,
2te Ausgabe, 1913. (Very largely concerned with an exposition of the
statistical methods.)

(17) Laplace, Pierre Simon, Marquis de, Essai philosophique sur les
probabilités, 1814. (The introduction to 18, separately printed with
some modifications.)

(18) Laplace, Pierre Simon, Marquis de, Théorie analytique des probabilités;
2nd edn., 1814, with supplements 1 to 4.

(19) Lexis, W., Abhandlungen zur Theorie der Bevölkerungs- und
Moralstatistik; Fischer, Jena, 1903.

(20) Poincaré, H., Calcul des probabilités; Gauthier-Villars, Paris, 1896.

(21) Poisson, S. D., Recherches sur la probabilité des jugements en matière
criminelle et en matière civile, précédées des règles générales du calcul
des probabilités, 1837. (German translation by C. H. Schnuse, 1841.)

(22) Quetelet, L. A. J., Lettres sur la théorie des probabilités, appliquée aux
sciences morales et politiques, 1846. (English translation by O. G.
Downes, 1849.)

(23) Thorndike, E. L., An Introduction to the Theory of Mental and Social
Measurements; Science Press, New York, 1904.

(24) Venn, J., The Logic of Chance: an Essay on the Foundations and
Province of the Theory of Probability, with especial reference to its
Logical Bearings and its Application to Moral and Social Science and to
Statistics; 3rd edn., Macmillan, London, 1888.

(25) Westergaard, H., Die Grundzüge der Theorie der Statistik; Fischer,
Jena, 1890.
I. NOTES SUPPLEMENTARY TO CHAPTER VI.

6. Position of Intervals. It is said in the text that in some
exceptional cases the observations exhibit a marked clustering
round certain values. The word exceptional should hardly have
been used. Whenever there is some doubt as to the final digit
in reading a scale, scope is given to the idiosyncrasies of the
observer and the distribution of frequency over the final digits
is rarely uniform. The most conspicuous feature is usually the
tendency to round off to the nearest unit, thus making 0 the
most frequent final digit, but 5's may also be emphasised if
emphasised on the scale itself, and the excesses of 0's and 5's
may be drawn in the most diverse ways from the other parts of
the scale.


Table A. Frequency-distributions of Final Digits in Measurements by
Four Observers.

                  Frequency of Final Digit per 1000.
Final Digit.        A.       B.       C.       D.

     0             158      122      251      358
     1              97       98       37       49
     2             125       98       80       90
     3              73       90       72       63
     4              76      100       55       37
     5              71      112      222      211
     6              90       98       71       62
     7              56       99       75       70
     8             126      101       72       44
     9             129       81       65       16

Total             1001      999     1000     1000

Actual
observations      1258     3000     1000     1000
Table A shows results for four observers as illustrations, the
frequencies being reduced for comparability to a total of 1000.
Column A is based on measures by myself, on drawings, to the
nearest tenth of a millimetre. It is recognised, of course, that
measures cannot really be made to such a degree of precision;
but I believed that I was making them carefully, and as they
were made with a Zeiss scale, in which the divisions are ruled
on the under side of a piece of plate-glass, readings are unaffected
by parallax. Nevertheless it will be seen that I heavily
over-emphasised the zeros, and also 2, 8, and 9: an odd selection of
preferences! On the whole, the centre of the millimetre was
neglected and measures piled up at the two ends.

The  data  for  columns  B,  C,  and  D  were  all  drawn  from  the 
same  published  report,  and  refer  to  sundry  head  measurements 
taken on the living subject. Guided by a statement in the
introduction, it was possible to compile the data separately for the
three  assistants  (B,  C,  D)  who  had  done  the  actual  measuring. 
It  will  be  seen  that  B  was  rather  good  :  there  is  a  relatively  slight 
excess  at  0  and  5,  but  otherwise  his  measurements  are  fairly 
uniformly  distributed.  C  was  decidedly  not  good,  rounding  off 
nearly  one  measurement  in  two  to  the  nearest  centimetre  or 
half-centimetre.  D  was  simply  outrageously  bad — so  bad  that 
it  might  have  been  better  not  to  publish  his  measurements. 
Nearly 57 per cent. of his measurements are made only to the
nearest  centimetre  or  half  centimetre — a  quite  inadequate  degree 
of  precision  for  head  measurements  often  only  a  few  centimetres 
in  magnitude. 

Compilation  of  data  in  the  form  of  Table  A  is  recommended  as 
some  control  of  their  value,  and  as  a  check  on  assistants. 
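
A compilation of this kind is easily mechanised. The following is a minimal sketch in Python, not part of the original note: the function name `final_digit_table`, the parameter `decimals` and the sample readings are all hypothetical, and the readings are assumed to be recorded to a fixed number of decimal places.

```python
from collections import Counter

def final_digit_table(readings, decimals=1):
    """Frequency per 1000 of each final digit 0-9 among the readings.

    `decimals` is the number of decimal places to which the readings
    were recorded (an assumption of this sketch).
    """
    counts = Counter()
    for value in readings:
        # Scale so that the last recorded digit becomes the units digit.
        scaled = round(abs(value) * 10 ** decimals)
        counts[scaled % 10] += 1
    total = sum(counts.values())
    return {digit: round(1000 * counts[digit] / total) for digit in range(10)}

# Hypothetical readings to the nearest tenth, with an excess of 0's and 5's.
sample = [12.0, 13.5, 12.0, 14.2, 15.0, 13.8, 12.5, 16.0, 14.0, 13.9]
table = final_digit_table(sample)
for digit in range(10):
    print(digit, table[digit])
```

A column that departs widely from 100 per 1000 in each row, as columns C and D of Table A do, points to rounding by the observer.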

15. The Extremely Asymmetrical or J-shaped Distribution.
Dr  J.  C.  Willis  has  shown  that  any  number  of  illustrations  of 
this  form  of  distribution  may  be  obtained  by  compiling  the 
frequency  distribution  for  numbers  of  genera  with  1,  2,  3  .  .  . 
species  in  any  biological  group.  Table  B  shows  the  distribution 
for  the  Chrysomelid  beetles. 


Table B. Chrysomelidæ (beetles). Numbers of Genera with 1, 2, 3 . . .
Species. (Compiled by Dr J. C. Willis, F.R.S.; cited from G. U. Yule,
"A Mathematical Theory of Evolution based on the Conclusions of Dr
J. C. Willis," Phil. Trans., B, vol. ccxiii., 1924, p. 85.)

Species.  Genera.     Species.  Genera.     Species.  Genera.

    1       215          32        1           74
    2        90          33        1           76
    3        38          34        1           77
    4        35          35        1           79
    5        21          36        3           83
    6        16          37        1           84
    7        15          38        1           87
    8        14          39        2           89
    9         5          40        2           92
   10        15          41        1           93
   11         8          43        4          110
   12         9          44        1          114
   13         5          45        1          115
   14         6          46        1          128
   15         8          49        2          132
   16         6          50        4          133
   17         6          52        1          146
   18         3          53        1          163
   19         4          56        1          196
   20         3          58        1          217
   21         4          59        1          227
   22         4          62        1          264
   23         5          63        3          327
   24         4          65        1          399
   25         2          66        1          417
   26         3          67        1          681
   27         1          69        1
   28         3          71        1
   29         3          72        1
   30         3          73        1          Total     627
II. DIRECT DEDUCTION OF THE FORMULÆ
FOR REGRESSIONS.

(Supplementary to Chapters IX. and XII.)

To those who are acquainted with the differential calculus the
following direct proof may be useful. It is on the lines of the
proof given in Chapter XII. § 3.

Taking first the case of two variables (Chapter IX.), it is
required to determine values of $a_1$ and $b_1$ in the equation

$$x = a_1 + b_1 y$$

(where x and y denote deviations from the respective means)
that will make the sum of the squares of the errors like

$$u = x' - (a_1 + b_1 y')$$

a minimum, $x'$ and $y'$ being a pair of associated deviations.

The required equations for determining $a_1$ and $b_1$ will be given
by differentiating

$$\Sigma(u^2) = \Sigma(x - a_1 - b_1 y)^2$$

with respect to $a_1$ and to $b_1$ and equating to zero.
Differentiating with respect to $a_1$, we have

$$\Sigma(x - a_1 - b_1 y) = 0.$$

But $\Sigma(x) = \Sigma(y) = 0$,

and consequently we have $a_1 = 0$.

Dropping $a_1$, and differentiating with respect to $b_1$,

$$\Sigma(x - b_1 y)\,y = 0.$$

That is,

$$b_1 = \frac{\Sigma(xy)}{\Sigma(y^2)} = r\,\frac{\sigma_x}{\sigma_y},$$

as on p. 171.

Similarly, if we determine the values of $a_2$ and $b_2$ in the
equation

$$y = a_2 + b_2 x$$

that will make the sum of the squares of the errors like

$$v = y' - (a_2 + b_2 x')$$

a minimum, we will find

$$a_2 = 0, \qquad b_2 = \frac{\Sigma(xy)}{\Sigma(x^2)} = r\,\frac{\sigma_y}{\sigma_x}.$$
If, as in Chapter XII. §§ 4 et seq. (cf. especially § 7), a number
of variables are involved, the equations for determining the
coefficients will be given by differentiating

$$\Sigma\left(x_1 - b_{12.34\ldots n}\,x_2 - \ldots - b_{1n.23\ldots(n-1)}\,x_n\right)^2$$

with respect to each coefficient in turn and equating the result to
zero. This gives the equations of the form there stated. If a
constant term be introduced, its "least square" value will be
found to be zero, as above.
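
The two-variable result can be checked numerically. The sketch below is an illustration only, written in Python with hypothetical data and a hypothetical function name: it forms deviations from the means and evaluates the sums of products and of squares, so that the regression of x on y is the sum of products divided by the sum of squares of y, and the regression of y on x is the same sum divided by the sum of squares of x.

```python
def regression_coefficients(xs, ys):
    """Return (b1, b2): regression of x on y, and of y on x.

    Both are computed from deviations about the means, so the
    constant terms vanish as in the deduction above.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]            # deviations of x
    dy = [y - mean_y for y in ys]            # deviations of y
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    return sum_xy / sum_yy, sum_xy / sum_xx

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b1, b2 = regression_coefficients(xs, ys)
print(b1, b2)        # the product b1 * b2 is the square of r
```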

III.  THE  LAW  OF  SMALL  CHANCES. 

(Supplementary to Chapter XV.)

We have seen that the normal curve is the limit of the binomial
$(p + q)^n$ when n is large and neither p nor q very small. The
student's attention will now be directed to the limit reached
when either p or q becomes very small, but n is so large that
either np or nq remains finite.

Let us regard the n trials of the event, for which the chance of
success at each trial is p, as made up of $m + m' = n$ trials; then
the probability of having at least m successes in the $m + m'$
trials is evidently the sum of the $m' + 1$ terms of the expansion
of $(p + q)^n$ beginning with $p^n$. But this probability, which we
may term $P_m$, can be expressed in another and more convenient
form with the help of the following reasoning. The required
result might happen in any one of $m' + 1$ ways. For instance:

(a) Each of the first m trials might succeed; the chance of
this is $p^m$.

(b) The first $m + 1$ trials might give m successes and 1 failure,
the latter not to happen on the $(m + 1)$th trial (a condition already
covered by (a)). But the probability of m successes and 1 failure,
the latter at a specified trial, is $p^m q$, and, as the failure might
occur in any one of m out of $m + 1$ trials, the complete probability
of (b) is $m p^m q$.

(c) The first $m + 2$ trials might give m successes and 2 failures,
the $(m + 2)$th trial not to be a failure (so as to avoid a repetition
of either of the preceding cases); the probability of this is

$$\frac{m(m+1)}{2!}\,p^m q^2.$$

In a similar way we find for the contribution of $m + 3$ trials,
giving m successes and 3 failures,

$$\frac{m(m+1)(m+2)}{3!}\,p^m q^3.$$
Ultimately we reach

$$P_m = p^m\left[1 + mq + \frac{m(m+1)}{2!}\,q^2 + \ldots + \frac{m(m+1)\ldots(m+m'-1)}{m'!}\,q^{m'}\right] \qquad (7)$$

This expression is of course equivalent to the first $m' + 1$ terms of
the binomial expansion beginning with $p^n$, as the student can
verify. For instance, if $m = n - 2$, so that $m' = 2$, we have

$$p^n + n\,p^{n-1}q + \frac{n(n-1)}{2!}\,p^{n-2}q^2.$$

Let us now suppose that q is very small, so that $m'/n$ = ratio of
failures to total trials is also very small. Let us also suppose
that n is so large that $nq = \lambda$ is finite. Writing $q = \lambda/n$ and putting
$m = n - m'$, (7) becomes

$$\left(1 - \frac{\lambda}{n}\right)^{n}\left(1 - \frac{\lambda}{n}\right)^{-m'}\left[1 + \lambda + \frac{\lambda^2}{2!} + \ldots + \frac{\lambda^{m'}}{m'!}\right]$$

since $m'/n$ and smaller fractions can be neglected.

But $\left(1 - \frac{\lambda}{n}\right)^{n}$ is shown in books on algebra to be equal to $e^{-\lambda}$,
where e is the base of the natural logarithms, when n is infinite
and, under similar conditions,

$$\left(1 - \frac{\lambda}{n}\right)^{-m'} = 1.$$

Hence, if n be large and q small, we have

$$P_m = e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2!} + \ldots + \frac{\lambda^{m'}}{m'!}\right) \qquad (8)$$

If we put $m' = 0$, we have the chance that the event succeeds
every time, and (8) reduces to $e^{-\lambda}$. Put $m' = 1$, and we get the
chance that the event shall not fail more than once, $e^{-\lambda}(1 + \lambda)$, so
that $e^{-\lambda}\lambda$ is the chance of exactly one failure, and the terms
within the bracket give us the proportional frequencies of 0, 1, 2,
etc. failures. In other words, (8) is the limit of the binomial
$(p + q)^n$ when q is very small but nq finite.
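
The approach to the limit may be seen numerically. The following Python sketch is an illustration only: the value 0.61 for nq and the function names are assumptions of the example. It compares the binomial chance of exactly k failures with the corresponding term of Poisson's limit as n grows while nq is held fixed.

```python
import math

def binomial_failures(n, q, k):
    """Chance of exactly k failures in n trials, each failing with chance q."""
    return math.comb(n, k) * q ** k * (1 - q) ** (n - k)

def poisson_term(lam, k):
    """The k-th term of the limit: e^(-lam) * lam^k / k!."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 0.61                       # hypothetical fixed value of nq
gaps = {}
for n in (10, 100, 10000):
    q = lam / n
    # Largest discrepancy between the two laws over 0 to 5 failures.
    gaps[n] = max(abs(binomial_failures(n, q, k) - poisson_term(lam, k))
                  for k in range(6))
    print(n, gaps[n])
```

The discrepancy shrinks as n increases, which is all that the limiting argument asserts; for small n the binomial itself should be used.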

The  investigation  contained  in  the  preceding  paragraphs  was 
published  in  1837  by  Poisson,  so  that  (8)  may  be  termed  Poisson's 
limit to the binomial; but the result has been reached independently
by several writers since Poisson's time, and we shall give
one  of  the  methods  of  proof  adopted  by  modern  statisticians,  which 
the  student  may  perhaps  find  easier  to  follow  than  that  of  Poisson 
(see  ref.  19,  p.  273). 

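$$(p + q)^n = (1 - q + q)^n = (1 - q)^n\left(1 + \frac{q}{1-q}\right)^n \qquad (9)$$

The first bracket on the right is equal to $e^{-\lambda}$ when q is
indefinitely small. Expanding the second bracket, we have

$$1 + \frac{\lambda}{q}\cdot\frac{q}{1-q} + \frac{1}{2!}\,\frac{\lambda}{q}\left(\frac{\lambda}{q} - 1\right)\left(\frac{q}{1-q}\right)^2 + \ldots$$

The ratio of the $(r + 1)$th to the rth term is

$$\frac{\lambda - (r-1)q}{(1-q)\,r} \qquad (9a)$$

which reduces to $\lambda/r$ when q is very small. The convergence of
the series is seen from the fact that r cannot exceed $\lambda/q$, and the
substitution of this value in (9a) reduces it to

$$\frac{q^2}{(1-q)\lambda},$$

which vanishes with q.

Hence the second bracket on the right of (9) may be written

$$1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \ldots,$$

and (9) is identical with (8).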
The frequent rediscovery of this theorem is due to the fact that
its value is felt in the study of problems involving small,
independent probabilities. For instance, if we desired to find the
distribution of n things in N pigeon-holes (all the pigeon-holes
being of equal size and equally accessible), N being large, the
distribution given by the binomial

$$\left(\frac{N-1}{N} + \frac{1}{N}\right)^n$$

would be effectively represented by (8), tables of which for
different values of $\lambda$ have been published by v. Bortkewitsch and
others.

The theorem has also been applied to cases in which, although
the actual value of q (or p) is unknown, it may safely be assumed
to be very small. It should be noticed that, if (8) is the real law
of distribution, certain relations must obtain between the
constants of the statistics (see par. 12, Chapter XIII.). Using the
method of par. 6, Chapter XV., we have for the mean

$$M = \lambda,$$

and for $\sigma^2$

$$\sigma^2 = \lambda.$$

Hence any statistics produced by causes conforming to Poisson's
limit should, within the limits of sampling, have the mean equal
to the square of the standard deviation. For instance, in the
statistics used in par. 12 of Chapter XIII., the mean is 0.61,
$\sigma = 0.78$, $\sigma^2 = 0.6079$.
If we now compute the theoretical frequencies from (8), putting
$\lambda = 0.61$, we have the following results:

Deaths.     Actual         Frequency assigned
            Frequency.     by Poisson's Limit.

   0          109             108.7
   1           65              66.3
   2           22              20.2
   3            3               4.1
   4            1               0.7 (4 and over)

The  agreement  here  is  excellent,  but  such  a  concordance  is  not 
very  common  in  actual  statistics.  Cases  do,  however,  occur  in 
which  the  method  is  of  service,  and  the  advanced  student  will  find 
that  the  reasoning  illustrated  is  of  value  in  many  theoretical 
investigations. 
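
The figures of the table can be recomputed directly. The Python sketch below is an illustration only: it takes the actual frequencies from the table, assumes that the total of 200 observations is simply their sum, multiplies that total by the successive terms of the limit with the parameter put at 0.61, and also checks that the mean and the square of the standard deviation of the actual frequencies are nearly equal.

```python
import math

# Actual frequencies from the table: deaths -> number of cases.
actual = {0: 109, 1: 65, 2: 22, 3: 3, 4: 1}
total = sum(actual.values())                 # 200 observations in all
lam = 0.61

# Theoretical frequencies for 0, 1, 2 and 3 deaths ...
theory = [total * math.exp(-lam) * lam ** k / math.factorial(k)
          for k in range(4)]
# ... and the remainder of the total for "4 and over".
theory.append(total - sum(theory))

# Mean and squared standard deviation of the actual distribution.
mean = sum(k * f for k, f in actual.items()) / total
variance = sum(k * k * f for k, f in actual.items()) / total - mean ** 2

for k, value in enumerate(theory):
    print(k, actual[k], round(value, 1))
print(mean, variance)
```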


IV.  GOODNESS  OF  FIT. 

(Supplementary to Chapter XVII.)

In  par.  15,  Chapter  XV.  (p.  308),  it  was  remarked  that  the  general 
treatment  of  the  problem,  whether  the  discrepancies  between 
any  system  of  observed  frequencies  and  those  postulated  by  a 
theoretical  law  might  have  arisen  by  the  operation  of  simple 
sampling,  was  beyond  the  scope  of  this  work.  As,  however,  the 
student  will  find  in  the  course  of  his  reading  that  a  test  of  this 
character  is  often  applied  in  practical  problems,  the  following 
notes  may  be  of  service  by  way  of  comment  on,  or  elucidation 
of,  the  highly  technical  papers  in  which  the  subject  is  fully 
discussed  (see  refs.  22  and  23,  p.  315,  and  also  additional 
refs.  on  p.  394). 

The  student  who  has  followed  the  argument  leading  up  to 
the  table  on  p.  310  will  have  perceived  that,  when  the  frequency 
distribution  of  a  variable  is  known,  the  probability  that  a  set  of 
observations  departing  from  the  most  likely  value  would  occur 
can  be  evaluated  by  comparing  the  portion  of  area  bounded  by 
the  ordinate  corresponding  to  the  observed  deviation  with  the 
whole  area  of  the  theoretical  curve,  and  the  work  is  illustrated 
in  Examples  i.-iv.  of  pp.  311-313.  In  this  case  there  is  only  a 
single  variable,  and  the  test  for  goodness  of  fit  is  reduced  to  its 
simplest  terms.    But  a  consideration  of  Chapter  XVI.,  and  the 
relation there shown to hold between the normal curve and the
surface  of  normal  correlation,  at  once  suggests  that  the  same 
principle  will  apply  when  there  are  two  variables. 

It was proved on pp. 319-321 that the contours of a normal
surface are a system of concentric ellipses. Now suppose we
have a normal system of frequency in two variables x and y,
then the chance that on simple sampling we should obtain the
combination x′y′ is measured by the corresponding ordinate of
the surface, and the feet of all ordinates of equal height will lie
upon an ellipse which will therefore be the locus of all
combinations of x and y equally likely to occur as is x′y′. Any
combination more likely to occur than x′y′ will have a taller ordinate,
and as the locus of its foot must also be an ellipse, that ellipse
will be contained within the x′y′ ellipse. Conversely,
combinations less likely to occur than x′y′ will be represented by
ordinates located upon ellipses wholly surrounding the x′y′
ellipse. Hence, if we dissect the surface into indefinitely thin
elliptical slices and determine the total volumes of the sum of
the slices from x = x′ and y = y′ down to x = 0 and y = 0, this
volume divided by the total volume of the surface will be the
probability of obtaining in sampling a result not worse than
x′y′; or, if we prefer, we may sum from x = x′, y = y′ to
x = y = ∞, and then the fraction is the chance of obtaining as
bad a result as x′y′, or a worse result.

The reader who has compared the figures on p. 166 and
p. 246, and followed the algebra of pp. 331-332, will have no
difficulty in seeing that, when the number of variables is
3, 4 . . . . n, the above principle remains valid although it
ceases to be possible to give a graphic representation. With
three variables the contour ellipse becomes an ellipsoidal surface,
and the four-dimensioned frequency "volume" must be dissected
into tridimensional ellipsoids; with four variables another
dimension is involved, and so on; but throughout the equation
of the contour of equal probability is of the ellipse type (cf. the
generalisation of the theorems of Chapter IX. in Chapter XII.).

Let us now suppose that if a certain set of data is derived
from a statistical universe conforming to a particular law, these
data, N in number, should be distributed into n + 1 groups
containing respectively $n_0, n_1, n_2, \ldots, n_n$ each. Instead of this
we actually find $m_0, m_1, m_2, \ldots, m_n$, where

$$m_0 + m_1 + \ldots + m_n = n_0 + n_1 + \ldots + n_n = N.$$

The problem to be solved is whether the observed system of
deviations from the most probable values might have arisen in
random sampling. Since, N being given, fixing the contents of
any n of the classes determines the (n + 1)th, there are only n
independent variables. Let us now suppose that the distribution
of deviations is normal. Then the equation of the frequency
"solid" is of the type set out in equation (12) of p. 331, which
we will write for the present in the form

$$z = z_0\,e^{-\frac{1}{2}\chi^2}.$$

$\chi^2$ = a constant, is then the equation of the "ellipsoid" delimiting
the two portions of the "volume" corresponding to
combinations more or less likely to occur than $m_0, m_1, \ldots, m_n$.
Accordingly, to find the chance of a system of deviations as
probable as or less probable than that observed, we have to
dissect the frequency solid, adding together the elliptic elements
from the ellipsoid $\chi$ to the ellipsoid ∞, and to divide this
summation by the total volume, i.e. the summation from the
ellipsoid 0 to the ellipsoid ∞.

In this book we have been concerned with summations the
elements of which were finite. The reader is probably aware
that when the element summed is taken indefinitely small the
summation is called an integration, the symbol ∫ replacing Σ or S,
and the infinitesimal element being written dx. In the present
case we have to reduce an n-fold integral, the summation relating
to n elements $dx_1$, $dx_2$, etc. To reduce this n-fold integral to a
single integral, the following method is adopted. In the first
place the ellipsoid, referred to its principal axes, is transformed
into a spheroid by stretching or squeezing, and the system of
rectangular co-ordinates transformed into polar co-ordinates.

The reason for adopting the latter device is that, when two
rectangular elements dx, dy are transformed to polar co-ordinates,
we replace them by an angular element $d\theta$, a vectorial element $dr$,
and a term in r the radius vector. When n such elements are
transformed, the integral vectorial factor is raised to the (n − 1)th
power and there is an infinitesimal vectorial element, $dr$, and a
"solid" angular element. But as the limits of integration of
the angular (not of the vectorial) element will be the same in
the numerator and denominator, these cancel out, while $\chi$ may
be treated as the vectorial element or ray. Hence the multiple
integral reduces to a single integral and the expression becomes

$$P = \frac{\int_{\chi}^{\infty} e^{-\frac{1}{2}\chi^2}\,\chi^{n-1}\,d\chi}{\int_{0}^{\infty} e^{-\frac{1}{2}\chi^2}\,\chi^{n-1}\,d\chi},$$

the reduction of which, its integration, can be effected in terms
of $\chi$ by methods described in text-books of the integral calculus.
Everything turns, therefore, upon the computation of the function $\chi$.
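
The single integral can also be evaluated by plain numerical quadrature. The Python sketch below is an illustration only. It assumes that the expression has the usual form, the integral of exp(−χ²/2)·χ^(n−1) from χ to infinity divided by the same integral from 0 to infinity, n being the number of independent variables (one less than the number of classes). The function names, the use of Simpson's rule and the finite upper limit standing in for infinity are all choices of the example.

```python
import math

def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def pearson_p(chi_sq, n, upper=40.0):
    """Ratio of the integral from chi to `upper` to that from 0 to `upper`.

    `upper` stands in for infinity; beyond it the integrand is
    negligible for the moderate values of n considered here.
    """
    def integrand(c):
        return math.exp(-0.5 * c * c) * c ** (n - 1)
    chi = math.sqrt(chi_sq)
    return simpson(integrand, chi, upper) / simpson(integrand, 0.0, upper)

# Thirteen classes give n = 12 independent variables.
print(pearson_p(30, 12))
print(pearson_p(40, 12))
```

With n = 12 the two printed values may be compared with the tabulated entries for 13 classes at 30 and at 40.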

As we have seen, $\chi^2$ is determined by evaluating the standard
deviations of the n variables and their correlations two at a time
(the higher partials being deducible if the correlations of zero
order are known).

By an application of the method of p. 257, we have

$$\sigma_p = \sqrt{n_p\left(1 - \frac{n_p}{N}\right)}$$

for the standard error of sampling in the content of the pth class;
while by a similar adaptation of the reasoning on p. 342 we reach

$$r_{pq} = -\sqrt{\frac{n_p\,n_q}{(N - n_p)(N - n_q)}}$$

for the correlation of errors of sampling in the pth and qth classes.
With these data, $\chi^2$ can be deduced (the actual process of
reduction is somewhat lengthy, but the student should have no difficulty
in following the steps given in pp. 370-2 of ref. 116, infra). Its
value is

$$\chi^2 = \sum_{p=0}^{n} \frac{(m_p - n_p)^2}{n_p},$$

the summation extending to all n + 1 classes of the frequency
distribution. (In the tables that follow, the observed and expected
frequencies of a class are written m′ and m respectively.)

Values of the probability that an equally likely or less likely
system of deviations will occur, usually denoted by the letter
P, have been computed for a considerable range of $\chi^2$ and
n′ = n + 1 = the number of classes, and are published in the Tables
for Statisticians and Biometricians mentioned on p. 358.

The  arithmetical  process  is  illustrated  upon  the  two  examples 
of  dice-throwing  given  on  p.  258. 

There are three points which the student should note as regards
the practical application of the method. In the first place, the
proof given assumes that deviations from the expected frequencies
follow the normal law. This is a reasonable assumption only if
no theoretical frequency is very small, for if it is very small the
distribution of deviations will be skew and not normal. It is
desirable, therefore, to group together the small frequencies in
the "tail" of the frequency distribution, as is done in the second
illustration below, so as to make the expected frequency a few
units at least. In the case of the first illustration it might have
been better to group the frequency of 0 successes with that of
Twelve Dice thrown 4096 times, a throw of 4, 5, or 6 points reckoned
a success (p. 258).

No. of        Observed      Expected Frequency (m),
Successes.    Frequency     4096(1/2 + 1/2)^12.       (m′ − m)².    (m′ − m)²/m.
              (m′).

    0             0               1                        1           1.0000
    1             7              12                       25           2.0833
    2            60              66                       36           0.5455
    3           198             220                      484           2.2000
    4           430             495                     4225           8.5354
    5           731             792                     3721           4.6982
    6           948             924                      576           0.6234
    7           847             792                     3025           3.8194
    8           536             495                     1681           3.3960
    9           257             220                     1369           6.2227
   10            71              66                       25           0.3788
   11            11              12                        1           0.0833
   12             0               1                        1           1.0000

Totals         4096            4096                                   34.5860 = χ²

From the tables we find:

   n′.      χ².       P.
   13       30      0.002792
   13       40      0.000072

Hence, by interpolation for χ² = 34.5860, P = 0.0015.

Twelve Dice thrown 4096 times, a throw of 6 points reckoned a success.

No. of        Observed      Expected Frequency (m),
Successes.    Frequency     4096(5/6 + 1/6)^12.       (m′ − m)².    (m′ − m)²/m.
              (m′).

    0           447             459                      144           0.3137
    1          1145            1103                     1764           1.5993
    2          1181            1213                     1024           0.8442
    3           796             809                      169           0.2089
    4           380             364                      256           0.7033
    5           115             116                        1           0.0086
    6            24              27                        9           0.3333
7 and over        8               5                        9           1.8000

Totals         4096            4096                                    5.8113 = χ²

From the tables we find:

   n′.      χ².       P.
    8        5      0.659963
    8        6      0.539750

Hence, by interpolation for χ² = 5.8113, P = 0.5624.
1 success, and the frequency of 12 successes with that of 11
successes.
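
The arithmetic of the two illustrations may be retraced as follows. This Python sketch is an illustration only and its function names are hypothetical. It sums the squared differences between observed and expected frequencies, each divided by the expected frequency, and then interpolates linearly between the two tabulated values of P quoted in each illustration, which appears to be the interpolation used there.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def interpolate_p(chi_sq, lower, upper):
    """Linear interpolation between two tabulated (chi^2, P) entries."""
    (x0, p0), (x1, p1) = lower, upper
    return p0 + (chi_sq - x0) * (p1 - p0) / (x1 - x0)

# First illustration: 4, 5 or 6 points reckoned a success (13 classes).
observed_a = [0, 7, 60, 198, 430, 731, 948, 847, 536, 257, 71, 11, 0]
expected_a = [1, 12, 66, 220, 495, 792, 924, 792, 495, 220, 66, 12, 1]
chi_a = chi_square(observed_a, expected_a)
p_a = interpolate_p(chi_a, (30, 0.002792), (40, 0.000072))

# Second illustration: 6 points reckoned a success (8 classes).
observed_b = [447, 1145, 1181, 796, 380, 115, 24, 8]
expected_b = [459, 1103, 1213, 809, 364, 116, 27, 5]
chi_b = chi_square(observed_b, expected_b)
p_b = interpolate_p(chi_b, (5, 0.659963), (6, 0.539750))

print(round(chi_a, 4), round(p_a, 4))
print(round(chi_b, 4), round(p_b, 4))
```

Linear interpolation over so wide an interval as 30 to 40 is rough, since P falls away much faster than in a straight line; a direct evaluation for the first illustration gives a value nearer 0.0005 than 0.0015, which does not affect the conclusion drawn.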

In  the  second  place,  the  proof  outlined  assumes  that  the 
theoretical  law  is  known  a  priori.  In  a  large  number,  perhaps 
almost the majority, of practical cases in which the test is
applied this condition is not fulfilled. We determine, for example,
the constants of a frequency curve from the observations
themselves, not from a priori considerations: we determine the
"independence  values"  of  the  frequencies  for  a  contingency 
table  from  the  given  row  and  column  totals,  again  not  from 
a  priori  considerations.  This  general  case  is  dealt  with  below, 
in  the  section  headed  "  Comparison  Frequencies  based  on  the 
Observations." 

Finally, attention should be paid to the run of the signs of
the differences m′ − m. The method used pays no attention to
the order of these signs, and it may happen that $\chi^2$ has quite a
moderate value and P is not small when all the positive differences
are on one side of the mode and all the negative differences on the
other, so that the mean shows a deviation from the expected value
that is quite outside the limits of sampling, or that the differences
are negative in both tails so that the standard deviation shows
an almost impossible divergence from expectation. In the first
example above all the differences are negative up to
5 successes, positive from 6 to 10 successes, and negative again for
11 and 12 successes. This is almost the first case supposed, and
in fact we have already found (p. 267) that the mean deviates
from the expected value by 5.1 (more precisely 5.13) times its
standard error. From Table II. of Tables for Statisticians we have:

Greater fraction of the area of a normal
curve for a deviation 5.13 . . . . . . .   0.9999998551
Area in the tail of the curve  . . . . .   0.0000001449
Area in both tails . . . . . . . . . . .   0.0000002898

so that the probability of getting such a deviation (+ or −) on
random sampling is only about 3 in 10,000,000. The value found
for P (0.0015) by the grouping used is therefore in some degree
misleading. If we regroup the distribution according to the
signs of m′ − m, we find


Successes.     Observed       Expected
               Frequency.     Frequency.

  0-5            1426           1586
  6-10           2659           2497
 11-12             11             13

Total            4096           4096
For this comparison n′ is 3, $\chi^2$ is 26.96, or practically 27, and P
is about 0.000001, a value much more nearly in accordance with
that suggested by the mean.
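
The regrouping by runs of sign can be carried out mechanically. The Python sketch below is an illustration only: the function name is hypothetical, and classes in which observation and expectation happen to be equal are thrown, by an arbitrary choice of the example, with the classes in defect. For three classes the probability integral reduces exactly to exp(−χ²/2), which is used in the last step.

```python
import math
from itertools import groupby

def regroup_by_sign(observed, expected):
    """Merge consecutive classes whose differences have the same sign.

    Returns a list of (observed total, expected total) pairs, one per run.
    """
    pairs = zip(observed, expected)
    groups = []
    for _, run in groupby(pairs, key=lambda pair: pair[0] > pair[1]):
        members = list(run)
        groups.append((sum(o for o, e in members),
                       sum(e for o, e in members)))
    return groups

# First dice illustration (4, 5 or 6 points reckoned a success).
observed = [0, 7, 60, 198, 430, 731, 948, 847, 536, 257, 71, 11, 0]
expected = [1, 12, 66, 220, 495, 792, 924, 792, 495, 220, 66, 12, 1]

groups = regroup_by_sign(observed, expected)
chi_sq = sum((o - e) ** 2 / e for o, e in groups)
# With three classes (two independent variables) P = exp(-chi^2 / 2).
p_value = math.exp(-chi_sq / 2)
print(groups, round(chi_sq, 2), p_value)
```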

Such a regrouping of the frequency distribution by the runs of
classes that are in excess and in defect of expectation would appear
often to afford a useful and severe test of the real extent of
agreement between observation and theory. In the second example
the signs are fairly well scattered, and the regrouping has a
comparatively small effect; the mean being in almost precise agreement
with expectation. The regrouped distribution is:


Successes.     Observed       Expected
               Frequency.     Frequency.

   0              447            459
   1             1145           1103
  2-3            1977           2022
   4              380            364
  5-6             139            143
  7-8               8              5

Total            4096           4096

Here n′ is 6, $\chi^2$ is 5.52, and P 0.36, so that the deviations from
expectation are still well within the range of fluctuations of
sampling.

The  value  of  P  is  the  probability  that  a  set  of  observations 
will  occur  giving  a  group  of  deviations  from  theory,  i.e.  a  value 
of  which  is  more  improbable  than  that  observed.  If,  to  take 
the  second  illustration  above,  we  were  to  repeat  4096  throws  of 
twelve  dice  a  large  number  of  times,  noting  the  throws  of  sixes, 
we  should  expect  to  get  a  worse  fit  to  theory,  i.e.  a  value  of  x 
greater  than  5*81,  roughly  speaking  56  times  in  every  hundred 
trials. 

The  value  of  P  corresponding  to  X"""^  necessarily  unity, 
for  it  is  certain  that  all  values  of  ^  must  exceed  zero.  If  the 
value  of  P  corresponding  to  \  is  Pj,  then  1  -  Pj  is  the 
probability  of  values  of  ^  between  0  and  1.  Similarlj^  if  the 
value  of  P  corresponding  to  =  2  is  Pg,  then  the  probability  of 
values  of  ^  between  1  and  2  is  Pj  -  Pg,  and  so  on.  Thus,  for 
16  classes  \n  =  16),  we  find  in  the  tables : — 
χ².        P.            Differences of P.

 0       1.000000
                           0.007873
 5       0.992127
                           0.172388
10       0.819739
                           0.368321
15       0.451418
                           0.279486
20       0.171932
                           0.171932

We should expect, therefore, in, say, 1000 sets of random
sampling with 16 classes, about 8 cases of χ² between 0 and 5,
about 172 cases between 5 and 10, 368 between 10 and 15,
279 between 15 and 20, and 172 over 20. The following table
shows the results obtained for the more modest number of 100
sets of trials, and gives very fair agreement with theory, especially
considering that the assumption of normality can hardly be
strictly true. The trials were carried out by throwing 200
beans into a revolving circular tray with sixteen equal radial
compartments, and counting the number of beans in each
compartment. The value of χ² was then computed, taking the
expected frequency as 200/16 = 12.5.


                 Number of Trials giving a Value of χ²
                 lying between the Limits on the Left.

χ².              Expected.      Observed.

 0- 5               0.8            0
 5-10              17.2           20
10-15              36.8           36
15-20              27.9           30.5
20 upwards         17.2           13.5

If we treat this in its turn as a comparison of observation with
theory, we find, bracketing the first two groups together, so as
to reduce the number of classes to four, χ² = 1.28, whence from
the tables P is approximately 0.74. That is to say, we should
expect a worse agreement with theory about three times out
of four.
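The differencing of the P-column, and the test of the bean experiment against it, can be sketched as follows. This is only an illustration under stated assumptions: `chi2_sf` is a hypothetical helper (not a function from the tables or from any library) that evaluates the tail probability by the exact finite series for whole numbers of degrees of freedom, n′ − 1; and the observed count for the interval 0-5 is taken as 0, since the other four observed entries already total 100.

```python
import math


def chi2_sf(x, df):
    """P(chi-square > x) for a positive integer number of degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times the sum of (x/2)^i / i! for i < df/2.
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: two-tail normal area at sqrt(x) plus a finite series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total


limits = [0, 5, 10, 15, 20]
df = 16 - 1  # entering the tables with n' = 16 means 15 degrees of freedom
p_values = [chi2_sf(x, df) for x in limits]

# Expected number of sets (out of 100) falling in each interval of chi-square.
expected = [100 * (a - b) for a, b in zip(p_values, p_values[1:])]
expected.append(100 * p_values[-1])

# Observed counts; the 0-5 entry is an assumption (see the note above).
observed = [0, 20, 36, 30.5, 13.5]

# Bracket the first two groups together, leaving four classes (n' = 4).
exp4 = [expected[0] + expected[1]] + expected[2:]
obs4 = [observed[0] + observed[1]] + observed[2:]
chi2_fit = sum((o - e) ** 2 / e for o, e in zip(obs4, exp4))
p_fit = chi2_sf(chi2_fit, len(obs4) - 1)

print([round(e, 1) for e in expected])
print(round(chi2_fit, 2), round(p_fit, 2))
```

Because the sketch keeps the unrounded expected frequencies, its χ² may differ in the second decimal place from a value worked with the one-decimal figures of the table; the resulting P remains close to three-quarters.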

It follows from what was said above that, in any series of trials
by simple sampling, equal numbers of cases should be found within
equal intervals of P, e.g. from 1.0 to 0.9, from 0.9 to 0.8, from
0.8 to 0.7, and so on. The frequency distribution of P, that is to
say, when we fulfil the conditions of simple sampling, is uniform
over the whole range from 0 to 1. Thus for a rough grouping
into four classes the above series of trials gave:


                 Number of Trials giving a Value of P
                 lying between the Limits on the Left.

P.               Expected.      Observed.

1.00-0.75           25             23
0.75-0.50           25             30
0.50-0.25           25             22
0.25-0              25             25

The value of χ² for this comparison is 1.52, giving P = 0.68, or
we should expect a worse fit roughly twice in every three trials.

COMPARISON  FREQUENCIES  BASED  ON  THE 
OBSERVATIONS. 

Contingency  Tables. — Attention  was  specially  directed  above 
to  the  fact  that  the  theoretical  frequencies  were  assumed  to  be 
given  a  priori.  The  theory  of  the  more  general  case,  in  which 
comparison  is  made  with  frequencies  determined  by  the  aid  of  the 
observations  themselves,  has  only  recently  been  fully  worked  out 
(Fisher,  ref.  118).  The  most  important  practical  case  of  the 
kind  is  that  of  association  or  contingency  tables  in  which  the 
observed  frequencies  are  compared  with  the  independence-values 
obtained  from  the  totals  of  rows  and  columns — that  is,  the  values 

$$(A_m B_n)_0 = \frac{(A_m)(B_n)}{N}$$

of Chapter V. § 6, p. 64, and in which the differences

$$\delta_{mn} = (A_m B_n) - (A_m B_n)_0$$

are used as an indication of the divergence from independence.
The rule to which the theory leads is a very simple one: the
method is still applicable, but the tables must be entered with n′
equal to the number of algebraically independent frequencies (or
values of δ) increased by unity, and not with n′ equal to the
number of compartments in the table. Now, if in any column
of the contingency table we are given all the values of δ but one
(say, the marginal value at the bottom), the remaining one can be
determined, because the sum of the δ's for every column must be
zero. The same statement must hold good for every row. Hence,
if r be the number of rows, c the number of columns, the number
of algebraically independent values of δ is (r − 1)(c − 1), and the
tables must be entered with the value

$$n' = (r - 1)(c - 1) + 1.$$

The student will realise that this is a reasonable rule if he
considers that when we take n′ as the number of classes, the
comparison frequencies being given a priori, we are taking it as
one more than the number of algebraically independent frequencies,
since the total number of observations is fixed.

The  following  will  serve  as  an  illustration  (Yule,  ref.  5  of 
Chapter  V.).  Sixteen  pieces  of  photographic  paper  were  printed 
down  to  different  depths  of  colour  from  nearly  white  to  a  very 
deep  blackish  brown.  Small  scraps  were  cut  from  each  sheet  and 
pasted  on  cards,  two  scraps  on  each  card  one  above  the  other, 
combining  scraps  from  the  several  sheets  in  all  possible  ways,  so 
that  there  were  256  cards  in  the  pack.  Twenty  observers  then 
went  through  the  pack  independently,  each  one  naming  each  tint 
either  "light,"  "medium,"  or  "dark." 

Table showing the Name (light, medium, or dark) assigned to each of two
Pieces of Photographic Paper on a Card: 256 Cards and 20 Observers.
Upper figure, observed frequency; central figure, independence frequency;
bottom figure, difference δ. (Yule, ref. 5 of Chap. V., Table XXI.)

Name assigned to        Name assigned to Upper Tint on Card.
Lower Tint on
Card.                   Light.     Medium.     Dark.       Total.

Light                    850        571         580         2001
                         785        633         583
                         +65        -62          -3

Medium                   618        593         455         1666
                         653        527         486
                         -35        +66         -31

Dark                     540        456         457         1453
                         570        460         423
                         -30         -4         +34

Total                   2008       1620        1492         5120
4225/785 . . . . .   5.38
3844/633 . . . . .   6.07
   9/583 . . . . .   0.02
1225/653 . . . . .   1.88
4356/527 . . . . .   8.27
 961/486 . . . . .   1.98
 900/570 . . . . .   1.58
  16/460 . . . . .   0.03
1156/423 . . . . .   2.73

Total  . . . . . .  27.94

n′ . . . . . . . .   5
P  . . . . . . . .   0.000012

The results are shown in the preceding table, the upper figure in
each compartment of the table being the observed frequency of
the corresponding pair of names. Below the observed frequency
are given the independence frequency (A_m B_n)_0 and the difference
δ_mn. It will be seen that the observed figures are not very close
to the independence-values, there being apparently a marked
tendency to give the same names to the two tints on any card, so
that all the diagonal frequencies are in excess of the
independence-values and all the others in defect.

Working out χ² as shown, the total comes to 27.94, or practically
28. Since r and c are both 3, n′ must be taken as (2 × 2) + 1,
that is, 5. Turning up the tables in the column n′ = 5, we find
P = 0.000012; that is to say, we would only expect to find so great
a divergence from independence, in random sampling, a little
more than once in 100,000 trials, so the result is certainly
significant.
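The whole calculation for the table of names can be sketched in a few lines. The sketch assumes only the nine observed frequencies given above; it forms the independence-values from the row and column totals, sums δ²/(independence-value), and uses the closed form of the tail probability that holds for an even number of degrees of freedom (here four, i.e. n′ = 5). The variable names are illustrative.

```python
import math

# Observed frequencies: rows = name given to lower tint, columns = upper tint.
observed = [
    [850, 571, 580],  # light
    [618, 593, 455],  # medium
    [540, 456, 457],  # dark
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        indep = row_totals[i] * col_totals[j] / n  # independence-value
        chi2 += (obs - indep) ** 2 / indep         # delta^2 / independence-value

rows, cols = len(observed), len(observed[0])
dof = (rows - 1) * (cols - 1)  # algebraically independent deltas, n' - 1

# Closed form valid for EVEN dof only: exp(-x/2) * sum (x/2)^i / i!, i < dof/2.
half = chi2 / 2
p = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(dof // 2))

print(f"chi2 = {chi2:.2f}, n' = {dof + 1}, P = {p:.6f}")
```

Since the independence-values are not rounded to whole numbers here, the total comes out slightly below the 27.94 of the text, and P is correspondingly a little larger, though of the same order.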

Association  Tables. — When  we  are  dealing  with  an  association 
table  there  are  only  two  rows  and  two  columns,  and  consequently 
n′ must be taken as (2 − 1)(2 − 1) + 1, that is, 2. But no column
for n′ = 2 is given in Tables for Statisticians and Biometricians, the
lowest value taken being n′ = 3, and a supplementary table (XV. c)
is not sufficiently detailed: the necessary table, reprinted by
permission from the Journal of the Royal Statistical Society
(ref. 119), will be found at the end of this Supplement. As will
be seen from the following illustrations, the required probability
can also be determined from the table of areas of the normal
curve, but it is very convenient to keep the arithmetic in the
usual form.

Example  i. — (Data  from  Chapter  III.,  p.  37.)  The  following 
data  are  there  cited  for  colour  of  flower  and  prickliness  of  fruit  in 
Datura :  the  independence-frequencies  have  been  entered  below 
the  numbers  of  observations. 
                         Fruit.
Flower.          Prickly.      Smooth.       Total.

Violet             47            12            59
                   48.337        10.663

White              21             3            24
                   19.663         4.337

Total              68            15            83


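Here δ is 1.337, and

$$\chi^2 = (1.337)^2 \left( \frac{1}{48.337} + \frac{1}{10.663} + \frac{1}{19.663} + \frac{1}{4.337} \right) = 0.708.$$

Turning up this value of χ² in the table at the end of this
Supplement, we find by interpolation P = 0.400. As stated in the
text, the association, negative in this case, is "so small that no
stress can be laid on it as indicating anything but a fluctuation
of sampling."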

Precisely the same result can be arrived at by working out the
standard error of the difference between the proportions of violet
and of white flowers that have smooth fruits, taking the ratio of
the difference to its standard error and then using the table of
areas of the normal curve. Thus:

Proportion of violet flowers that have smooth
  fruits, 12/59 or . . . . . . . . . . . . . . . . 0.2033
Proportion of white flowers that have smooth
  fruits, 3/24 or  . . . . . . . . . . . . . . . . 0.1250
Difference . . . . . . . . . . . . . . . . . . . . 0.0783
Proportion of all flowers that have smooth fruits,
  15/83 or . . . . . . . . . . . . . . . . . . . . 0.1807

Standard error of the difference between proportions of smooth
fruits in sampling from a universe in which the proportions are
0.1807 and 0.8193, and the numbers in the samples 59 and 24
respectively:

$$\sqrt{0.1807 \times 0.8193 \left( \frac{1}{59} + \frac{1}{24} \right)} = 0.0932.$$

Hence the ratio of the observed difference to its standard error is
0.0783/0.0932 or 0.840.
Interpolating in the table of areas of the normal curve on
p. 310, or taking the required figure directly from Table II. of
Tables for Statisticians, we have:

Greater fraction of area for a deviation of 0.84 in
  the normal curve . . . . . . . . . . . . . . . . 0.7995
Area in the tail . . . . . . . . . . . . . . . . . 0.2005
Area in both tails . . . . . . . . . . . . . . . . 0.401

That is to say, the probability of getting a difference, of either
sign, as great as or greater than that actually observed is 0.401,
agreeing, within the accuracy of the arithmetic, with the
probability given by the χ² method.

The same result would again have been obtained had we worked
from the columns instead of from the rows, and considered the
difference between the proportions of white flowers for prickly and
for smooth fruits respectively.
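The agreement of the two methods is not an accident of the arithmetic: for a table with two rows and two columns, the square of the ratio of the difference to its standard error is identical with χ². A minimal sketch, assuming only the four Datura frequencies quoted above (the variable names are illustrative), works the example both ways.

```python
import math

# Datura data: rows = flower colour, columns = fruit (prickly, smooth).
a, b = 47, 12  # violet
c, d = 21, 3   # white
n = a + b + c + d

# Method 1: chi-square from the independence-values.
obs = [[a, b], [c, d]]
rows = [a + b, c + d]
cols = [a + c, b + d]
chi2 = sum(
    (obs[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
    for i in range(2)
    for j in range(2)
)
# With one degree of freedom (n' = 2), P is the two-tail normal area at sqrt(chi2).
p_chi2 = math.erfc(math.sqrt(chi2 / 2))

# Method 2: difference between the proportions of smooth fruits.
prop_violet = b / (a + b)
prop_white = d / (c + d)
pooled = (b + d) / n
std_error = math.sqrt(pooled * (1 - pooled) * (1 / (a + b) + 1 / (c + d)))
ratio = (prop_violet - prop_white) / std_error
p_normal = math.erfc(abs(ratio) / math.sqrt(2))

print(f"chi2 = {chi2:.3f}, ratio^2 = {ratio ** 2:.3f}")
print(f"P (chi-square) = {p_chi2:.3f}, P (normal) = {p_normal:.3f}")
```

Both routes give a P of about 0.40; any small discrepancy in hand working arises only from rounding the intermediate figures.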

Example  ii.— (Data  from  ref.  6  of  Chapter  III,  Table  XIV.) 
The  following  table  shows  the  result  of  inoculation  against  cholera 
on  a  certain  tea  estate  : — 


                      Not-attacked.    Attacked.     Total.

Inoculated               431              5           436
                         427.7            8.3

Not-inoculated           291              9           300
                         294.3            5.7

Total                    722             14           736

As in the last example, the independence-frequencies have been
given below the numbers observed. The value of δ is 3.3, and
χ² is 3.27.

From the table on p. 389 P is 0.0706.

Working from the proportions attacked, we can arrive at the
same result.

Proportion attacked amongst inoculated  . . . . . 0.01147
    "         "        "    not-inoculated  . . . 0.03000
Difference  . . . . . . . . . . . . . . . . . . . 0.01853

The standard error of the difference is

$$\sqrt{0.98098 \times 0.01902 \left( \frac{1}{436} + \frac{1}{300} \right)} = 0.01025.$$

The ratio of the difference to its standard error is therefore
0.01853/0.01025, or 1.808.

Greater fraction of normal curve for a deviation of 1.808 is 0.96470
Fraction in tail . . . . . . . . . . . . . . . . . . . . . . 0.03530
Fraction in the two tails  . . . . . . . . . . . . . . . . . 0.07060

As before, both methods must lead to the same result.

An  Aggregate  of  Tables. — It  may  often  happen  that  we  have 
formed  a  number  of  contingency  or  association  tables — more 
often  the  latter  than  the  former — for  similar  data  from  different 
fields.  All  may  give,  perhaps,  a  positive  association,  but  the 
values  of  P  may  run  so  high  that  we  do  not  feel  any  great  con- 
fidence even  in  the  aggregate  result.  The  question  then  arises 
whether  we  cannot  obtain  a  single  value  of  P  for  the  aggregate  as 
a  whole,  telling  us  what  is  the  probability  of  getting  by  mere 
random  sampling  a  series  of  divergences  from  independence  as 
great  as  or  greater  than  those  observed.  The  question  is  usually 
answered  by  pooling  the  tables ;  but,  in  view  of  the  fallacies  that 
may be introduced by pooling (cf. Chapter IV. §§ 6 and 7), this
method is not quite satisfactory. A better answer is given by the
application of the present general rule. Add up all the values of
χ² for the different tables, thus obtaining the value of χ² for the
aggregate, and enter the P-tables with a value of n′ equal to the
total of algebraically independent frequencies increased by unity:
that is, take n′ as given by

$$n' = 1 + \sum (r - 1)(c - 1).$$

For the association table there is only one algebraically
independent value of δ. Hence if we are testing the divergence from
independence of an aggregate of association tables, we must add
together the values of χ² and enter the P-tables with n′ taken as
one more than the number of tables in the aggregate.

Thus from ref. 6 of Chapter III., from which the data of
Example ii. were cited, we take the following values of χ² and of
P for six tables that include that example. They refer to six
different estates in the same group.


           χ².         P.

           9.34      0.0022
           6.08      0.014
           2.51      0.11
           3.27      0.071
           5.61      0.018
           1.59      0.21

Total .   28.40
The association between inoculation and protection from attack
is positive for each estate, but for only one of the tables is the
value of P so small that we can say the result is very unlikely to
have arisen as a fluctuation of sampling. Adding up the values
of χ² the total is 28.40, and entering the column for n′ = 7 (one
more than the number of tables considered), we find

χ².       P.

28     0.000094
29     0.000061

whence by interpolation the value of P is 0.000081, i.e. we should
only expect to get a total of χ²'s as great as or greater than this, on
random sampling, 81 times in 1,000,000 trials. We can therefore
regard the results as significant with a high degree of confidence.

We may, I think, go further: for all the observed associations
are positive, and in six cases there are 2⁶ or 64 possible
permutations of sign. We should therefore only expect to get an equal
or greater total value of χ² and tables all showing positive
association, not 81 times in 1,000,000 trials but 81/64 or, roundly, 1.3
times. P for the observed event (χ² = 28.4 and all associations
positive) is therefore only 0.0000013.
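The aggregate rule can be illustrated with the six values of χ² quoted above. The following is a sketch only: it evaluates P directly from the closed form that holds for an even number of degrees of freedom (six here, i.e. n′ = 7) instead of interpolating in the printed column, so its answer is slightly smaller than the interpolated 0.000081. The division by 2⁶ reproduces the argument from the signs of the associations.

```python
import math

# Values of chi-square for the six estates (one fourfold table each).
chi2_values = [9.34, 6.08, 2.51, 3.27, 5.61, 1.59]

total = sum(chi2_values)
dof = len(chi2_values)  # one independent delta per fourfold table; n' = dof + 1

# Closed form valid for EVEN dof only: exp(-x/2) * sum (x/2)^i / i!, i < dof/2.
half = total / 2
p = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(dof // 2))

# All six associations are positive: one arrangement of signs out of 2^6.
p_all_positive = p / 2 ** len(chi2_values)

print(f"total chi2 = {total:.2f}, n' = {dof + 1}")
print(f"P = {p:.6f}, P with all signs positive = {p_all_positive:.7f}")
```

Linear interpolation between tabulated values of a rapidly falling P overstates it a little, which accounts for the difference between the two figures.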

Experimental  Illustrations  of  the  General  Case. — The  formulae 
for  the  general  case,  as  for  the  special  case  in  which  the  frequencies 
with  which  comparison  is  made  are  given  a  priori,  can  be  checked 
by  experiment. 

The numbers of beans counted in each of the sixteen
compartments of the revolving circular tray mentioned on p. 377 above
were entered as the frequencies of a table (1) with 4 rows and
4 columns, (2) with 2 rows and 8 columns, and the value of χ²
computed for each table for divergence from independence. For
the two cases we have

n′ = (3 × 3) + 1 = 10
and n′ = (1 × 7) + 1 = 8

respectively. Differencing the columns for P corresponding to
these two values of n′, we obtain the theoretical
frequency-distributions given in the columns headed "Expectation" in Table A.
The observed distributions of the values of χ² in 100 experimental
tables are given in the columns headed "Observation." It will be
seen that the agreement between expectation and observation is
excellent for so small a number of observations. If the goodness
of fit be tested by the χ² method, grouping together the frequencies
from χ² = 15 upwards, so that n′ is 4, χ² is found to be 2.27 for
the 4 × 4 tables and 4.36 for the 2 × 8 tables, giving P = 0.52 in
the first case and 0.22 in the second.
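The "Expectation" columns can be recovered by differencing the tail probabilities for nine and for seven degrees of freedom (n′ = 10 and n′ = 8). The sketch below assumes a hypothetical helper `chi2_sf`, written for this illustration from the exact finite series for whole numbers of degrees of freedom; it is not a function taken from the published tables.

```python
import math


def chi2_sf(x, df):
    """P(chi-square > x) for a positive integer number of degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times the sum of (x/2)^i / i! for i < df/2.
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: two-tail normal area at sqrt(x) plus a finite series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total


def expected_counts(rows, cols, limits, trials=100):
    """Expected number of tables with chi-square in each interval of `limits`."""
    df = (rows - 1) * (cols - 1)  # independent deltas, i.e. n' - 1
    p = [chi2_sf(x, df) for x in limits]
    counts = [trials * (a - b) for a, b in zip(p, p[1:])]
    counts.append(trials * p[-1])  # open interval above the last limit
    return counts


limits = [0, 5, 10, 15, 20]
exp_4x4 = expected_counts(4, 4, limits)
exp_2x8 = expected_counts(2, 8, limits)

print([round(e, 1) for e in exp_4x4])
print([round(e, 1) for e in exp_2x8])
```

The same sixteen compartments thus lead to quite different expectations according to the arrangement of rows and columns, which is the point of the rule for n′.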
Table A. Theoretical Distribution of χ², calculated from Independence-values,
in Tables with 16 Compartments, compared with the Actual Distributions
given by 100 Experimental Tables. In the first case n′ must be taken as
10, in the second as 8. (Ref. 119.)

             4 Rows, 4 Columns.              2 Rows, 8 Columns.

χ².       Expectation.  Observation.     Expectation.  Observation.

 0- 5        16.6          17               34.0          29.5
 5-10        48.4          44               47.1          56.5
10-15        26.0          32               15.3          10
15-20         7.3           6                3.0           3
20-           1.8           1                0.6           1

Total       100.1         100              100.0         100

For tables with 2 rows and 2 columns 350 experimental tables of
100 observations each were available. The observed distribution of
values of χ², calculated from the independence frequencies, is shown
in Table B, together with the theoretical distribution obtained by
differencing the table on pp. 385-386. Testing goodness of fit on
Table B as it stands, n′ is 10, χ² works out at 7.53, and P is 0.583.

Table B. Theoretical Distribution of χ² for a Table with 2 Rows and 2
Columns, when χ² is calculated from the Independence-values, compared
with the Actual Results for 350 Experimental Tables. (Ref. 119.)

                     Number of Tables.

Value of χ².       Expected.     Observed.

0   -0.25           134.02         122
0.25-0.50            48.15          54
0.50-0.75            32.56          41
0.75-1.00            24.21          24
1   -2               56.00          62
2   -3               25.91          18
3   -4               13.22          13
4   -5                7.05           6
5   -6                3.86           5
6   -                 5.01           5

Total               349.99         350
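The expected column of Table B, and the goodness-of-fit test made upon it, can be reproduced from the normal curve alone, since for the fourfold table P is the area in both tails beyond the square root of χ². The following sketch assumes only the class limits and observed numbers of Table B; `p_fourfold` is an illustrative name.

```python
import math


def p_fourfold(chi2):
    """P for a 2 x 2 table (n' = 2): two-tail normal area at sqrt(chi2)."""
    return math.erfc(math.sqrt(chi2 / 2))


limits = [0, 0.25, 0.50, 0.75, 1, 2, 3, 4, 5, 6]
tables = 350

p = [p_fourfold(x) for x in limits]
expected = [tables * (a - b) for a, b in zip(p, p[1:])]
expected.append(tables * p[-1])  # chi-square of 6 and upwards

observed = [122, 54, 41, 24, 62, 18, 13, 6, 5, 5]

# Goodness of fit of observation to expectation over the ten classes (n' = 10).
chi2_fit = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print([round(e, 2) for e in expected])
print(round(chi2_fit, 2))
```

The value of χ² so found agrees with the 7.53 quoted above to within the rounding of the expected frequencies.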
The theorem last given for evaluating P for an aggregate of
tables is illustrated by the experimental data of Tables C and D.
The values of χ² for the 350 fourfold tables of Table B were
added together in pairs, giving 175 pairs. According to theory
the resulting frequency-distribution for the totals of pairs of χ²'s
should be given by differencing the column of the P-table for
n′ = 3. The results of theory and observation are compared in
the first pair of columns of Table C. Testing goodness of fit,
grouping the values of χ² 7 and upwards, n′ is 8, χ² is 5.53, and
P is 0.60.

Grouping the values of χ² for the 350 experimental tables
similarly in sets of three and summing, we get the observed
distribution on the right of Table C, and the theoretical
distribution by differencing the column of the P-table for n′ = 4.
Grouping values of 8 and upwards, and testing goodness of fit
between theory and observation, n′ is 9, χ² is 2.18, and P is 0.97.

Table C. Theoretical Distribution of Totals of χ² (calculated from
Independence-values) for Pairs and for Sets of Three Tables with 2 Rows and 2
Columns, compared with the Actual Distributions given by Experimental
Tables. n′ must be taken as 3 in the first case, and 4 in the second.

              Pairs of Tables.                Sets of 3 Tables.

Sum of χ².  Expectation.  Observation.     Expectation.  Observation.

0-1            68.9          67               23.1          21
1-2            41.8          46               26.5          26
2-3            25.3          22               21.0          22
3-4            15.4          19               15.1          19
4-5             9.3           7               10.4           9
5-6             5.6           3                7.0           7
6-7             3.4           6                4.6           4
7-8             2.1           3                3.0           4
8-              3.2           2                5.3           4

Total         175.0         175              116.0         116
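The expectations of Table C follow from the additive rule: the sum of χ² for k fourfold tables is referred to n′ = k + 1, that is, to k degrees of freedom. The sketch below assumes a hypothetical helper `chi2_sf` (the exact finite series for whole numbers of degrees of freedom, written for this illustration) and the numbers of sets stated in the table, 175 pairs and 116 sets of three.

```python
import math


def chi2_sf(x, df):
    """P(chi-square > x) for a positive integer number of degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times the sum of (x/2)^i / i! for i < df/2.
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: two-tail normal area at sqrt(x) plus a finite series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total


def expected_for_sets(tables_per_set, number_of_sets, limits):
    """Expected distribution of the summed chi-square of fourfold tables."""
    df = tables_per_set  # one independent delta per table; n' = df + 1
    p = [chi2_sf(x, df) for x in limits]
    counts = [number_of_sets * (a - b) for a, b in zip(p, p[1:])]
    counts.append(number_of_sets * p[-1])
    return counts


limits = list(range(0, 9))  # 0, 1, ..., 8; the last class is "8 and upwards"
pairs = expected_for_sets(2, 175, limits)
triples = expected_for_sets(3, 116, limits)

print([round(e, 1) for e in pairs])
print([round(e, 1) for e in triples])
```

For the pairs the tail probability reduces to exp(−χ²/2), so that the first expectation is simply 175(1 − e^(−1/2)).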

Table D makes a similar comparison for the values of χ²,
calculated from independence, for 100 pairs of 4 × 4 tables.
Here there are 9 algebraically independent δ's for each table of
the pair, and consequently n′ must be taken as 19. Differencing
the P-table for n′ = 19, the expected distribution is obtained, which
is shown in the first column of Table D, the observed distribution
being given in the second column. Taking the two groups at the
bottom of the table together and testing goodness of fit, χ² is
found to be 4.11, n′ is 5, and P is 0.39.

Table D. Theoretical Distribution of Totals of χ² (calculated from
Independence-values) for Pairs of Tables with 4 Rows and 4 Columns, compared with
the Actual Distribution given by Experimental Tables.

Sum of two χ²'s.    Expectation.    Observation.

 0-10                   6.8              8
10-15                  27.0             27
15-20                  32.9             31
20-25                  20.8             27
25-30                   8.8              6
30-                     3.7              1

Total .               100.0            100

The general theorem that n′ must be taken equal to the number
of algebraically independent frequencies increased by unity applies
not only to association and contingency tables, but to all cases in
which the frequencies observed are connected with those expected
by a number of linear relations, beyond their restriction to the
same total frequency (Fisher, ref. 118). Thus, if a frequency
curve has been fitted by the mean and standard deviation, n′
should be taken as 2 less than the number of classes: if it has
been fitted by the first four moments, n′ should be taken as four
less than the number of classes.
Table of the Values of P for Divergence from Independence in the
Fourfold Table.

A. χ² = 0 to χ² = 1 by steps of 0.01.

 χ²        P         Δ         χ²        P         Δ

0.00    1.00000    7966       0.50    0.47950     436
0.01    0.92034    3280       0.51    0.47514     430
0.02    0.88754    2505       0.52    0.47084     423
0.03    0.86249    2101       0.53    0.46661     418
0.04    0.84148    1842       0.54    0.46243     411
0.05    0.82306    1656       0.55    0.45832     406
0.06    0.80650    1516       0.56    0.45426     400
0.07    0.79134    1404       0.57    0.45026     395
0.08    0.77730    1312       0.58    0.44631     389
0.09    0.76418    1235       0.59    0.44242     384
0.10    0.75183    1169       0.60    0.43858     379
0.11    0.74014    1111       0.61    0.43479     374
0.12    0.72903    1060       0.62    0.43105     369
0.13    0.71843    1015       0.63    0.42736     365
0.14    0.70828     974       0.64    0.42371     360
0.15    0.69854     938       0.65    0.42011     355
0.16    0.68916     905       0.66    0.41656     351
0.17    0.68011     874       0.67    0.41305     346
0.18    0.67137     845       0.68    0.40959     343
0.19    0.66292     820       0.69    0.40616     338
0.20    0.65472     795       0.70    0.40278     334
0.21    0.64677     773       0.71    0.39944     330
0.22    0.63904     752       0.72    0.39614     326
0.23    0.63152     731       0.73    0.39288     322
0.24    0.62421     713       0.74    0.38966     318
0.25    0.61708     696       0.75    0.38648     315
0.26    0.61012     679       0.76    0.38333     311
0.27    0.60333     663       0.77    0.38022     308
0.28    0.59670     648       0.78    0.37714     304
0.29    0.59022     634       0.79    0.37410     301
0.30    0.58388     620       0.80    0.37109     297
0.31    0.57768     607       0.81    0.36812     294
0.32    0.57161     595       0.82    0.36518     291
0.33    0.56566     583       0.83    0.36227     287
0.34    0.55983     572       0.84    0.35940     285
0.35    0.55411     560       0.85    0.35655     281
0.36    0.54851     551       0.86    0.35374     278
0.37    0.54300     540       0.87    0.35096     276
0.38    0.53760     530       0.88    0.34820     272
0.39    0.53230     521       0.89    0.34548     270
0.40    0.52709     512       0.90    0.34278     267
0.41    0.52197     503       0.91    0.34011     264
0.42    0.51694     495       0.92    0.33747     261
0.43    0.51199     487       0.93    0.33486     258
0.44    0.50712     479       0.94    0.33228     256
0.45    0.50233     471       0.95    0.32972     253
0.46    0.49762     463       0.96    0.32719     251
0.47    0.49299     457       0.97    0.32468     248
0.48    0.48842     449       0.98    0.32220     246
0.49    0.48393     443       0.99    0.31974     243
0.50    0.47950     436       1.00    0.31731     241
B. χ² = 1 to χ² = 10 by steps of 0.1.

 χ²        P         Δ         χ²        P         Δ

 1.0    0.31731    2304        5.5    0.01902     106
 1.1    0.29427    2095        5.6    0.01796      99
 1.2    0.27332    1911        5.7    0.01697      94
 1.3    0.25421    1749        5.8    0.01603      89
 1.4    0.23672    1605        5.9    0.01514      83
 1.5    0.22067    1477        6.0    0.01431      79
 1.6    0.20590    1361        6.1    0.01352      74
 1.7    0.19229    1258        6.2    0.01278      71
 1.8    0.17971    1163        6.3    0.01207      66
 1.9    0.16808    1078        6.4    0.01141      62
 2.0    0.15730    1000        6.5    0.01079      59
 2.1    0.14730     929        6.6    0.01020      56
 2.2    0.13801     864        6.7    0.00964      52
 2.3    0.12937     803        6.8    0.00912      50
 2.4    0.12134     749        6.9    0.00862      47
 2.5    0.11385     699        7.0    0.00815      44
 2.6    0.10686     651        7.1    0.00771      42
 2.7    0.10035     609        7.2    0.00729      39
 2.8    0.09426     568        7.3    0.00690      38
 2.9    0.08858     532        7.4    0.00652      35
 3.0    0.08326     497        7.5    0.00617      33
 3.1    0.07829     465        7.6    0.00584      32
 3.2    0.07364     436        7.7    0.00552      30
 3.3    0.06928     408        7.8    0.00522      28
 3.4    0.06520     383        7.9    0.00494      26
 3.5    0.06137     359        8.0    0.00468      25
 3.6    0.05778     337        8.1    0.00443      24
 3.7    0.05441     316        8.2    0.00419      23
 3.8    0.05125     296        8.3    0.00396      21
 3.9    0.04829     279        8.4    0.00375      20
 4.0    0.04550     262        8.5    0.00355      19
 4.1    0.04288     246        8.6    0.00336      18
 4.2    0.04042     231        8.7    0.00318      17
 4.3    0.03811     217        8.8    0.00301      16
 4.4    0.03594     205        8.9    0.00285      15
 4.5    0.03389     192        9.0    0.00270      14
 4.6    0.03197     181        9.1    0.00256      14
 4.7    0.03016     170        9.2    0.00242      13
 4.8    0.02846     160        9.3    0.00229      12
 4.9    0.02686     151        9.4    0.00217      12
 5.0    0.02535     142        9.5    0.00205      10
 5.1    0.02393     134        9.6    0.00195      11
 5.2    0.02259     126        9.7    0.00184      10
 5.3    0.02133     119        9.8    0.00174       9
 5.4    0.02014     112        9.9    0.00165       8
 5.5    0.01902     106       10.0    0.00157       8

For values of P corresponding to χ² = 11 to χ² = 30, by units, see Table XV. (c),
p. 30 of Tables for Statisticians and Biometricians.




ADDITIONAL REFERENCES.

History of Statistics (p. 6).

(1) Koren, J. (edited by), The History of Statistics, their Development and
Progress in many Countries, New York, The Macmillan Co., 1918.
(A collection of articles, mainly on the progress of official statistics,
written by a specialist for each country.)

(2) Walker, Helen M., Studies in the History of Statistical Method,
Baltimore, Williams & Wilkins Co., 1929. (Most detailed on recent
history: chapters on the Normal Curve, Moments, Percentiles,
Correlation, Spearman's theory of Two Factors for Intelligence,
Statistics as a Subject of Instruction in American Universities, and
the Origin of certain Technical Terms. Useful bibliographies.)

(3) Hotelling, H., "British Statistics and Statisticians Today," Jour.
Amer. Stat. Assoc., vol. xxv., 1930, p. 186.

Contingency (p. 73).

(4) Pearson, Karl, "On the Measurement of the Influence of Broad
Categories on Correlation," Biometrika, vol. ix., 1913, p. 116.

(5) Pearson, Karl, "On the General Theory of Multiple Contingency with
Special Reference to Partial Contingency," Biometrika, vol. xi., 1916,
p. 145. (An extension of the method of contingency coefficients to
classification subjected to various conditions; arithmetical examples
are provided in the undermentioned paper.)

(6) Pearson, Karl, and J. F. Tocher, "On Criteria for the Existence of
Differential Death-Rates," Biometrika, vol. xi., 1916, p. 159.

(7) Ritchie-Scott, A., "The Correlation Coefficient of a Polychoric Table,"
Biometrika, vol. xii., 1918, p. 93. (Considers various methods of
measuring association with special reference to 4 × 3-fold classifications.)

(8) Pearson, Karl, and E. S. Pearson, "On Polychoric Coefficients of
Correlation," Biometrika, vol. xiv., 1922, p. 127.

The Mode (p. 130).

(9) Doodson, Arthur T., "Relation of the Mode, Median and Mean, in
Frequency Curves," Biometrika, vol. xi., 1916-17, p. 429. (Gives a
proof of the relation noted on p. 121.)

Index-numbers (p. 130).

There are useful discussions as to method in the following:

(10) Knibbs, G. H., "Prices, Price-Indexes, and Cost of Living in Australia,"
Commonwealth of Australia, Labour and Industrial Branch, Report
No. 1, 1912.

(11) Wood, Frances, "The Course of Real Wages in London, 1900-12,"
Jour. Roy. Stat. Soc., vol. lxxvii., 1913-14, p. 1.

(12) Working Classes, Cost of Living Committee, 1918, Report (Cd.
8980, 1918), H.M. Stationery Office.

(13) Bowley, A. L., "The Measurement of Changes in Cost of Living,"
Jour. Roy. Stat. Soc., vol. lxxxii., 1919, p. 343.

(14) Bennett, T. L., "The Theory of Measurement of Changes in the Cost
of Living," Jour. Roy. Stat. Soc., vol. lxxxiii., 1920, p. 455.

(15) Flux, A. W., "The Measurement of Price Changes," Jour. Roy. Stat.
Soc., vol. lxxxiv., 1921, p. 167.

(16) Fisher, Irving, "The Best Form of Index-number," Quart. Pub.
Amer. Stat. Assoc., March 1921, p. 533.



(17)  Persons,  W.  M.,  "  Fisher's  Formula  for  Index-numbers,"  Rev.  Econ. 

Statistics,  vol.  iii.,  1921,  p.  103. 

(18) March, L., "Les modes de mesure du mouvement général des prix,"
Metron, vol. i., No. 4, 1921, p. 40.

(19) Fisher, Irving, The Making of Index-numbers, Houghton Mifflin Co.,
Boston and New York, 1922. (Useful as a repertory of formulae, with
tests of the results given on certain American data; otherwise, cf.
reviews in Economic Journal, vol. xxxiii., p. 90 and p. 246, and
Jour. Roy. Stat. Soc., vol. lxxxvi., p. 424, and vol. lxxxvii., p. 89.)

(20)  Marshall,  A.,  Money,  Credit,  and  Commerce,  Macmillan,  London,  1923. 

For  the  student  of  the  cost  of  living  in  Great  Britain  the  following 
are  useful : — 

(21)  "  Labour  Gazette  Index  Number  :  Scope  and  Method  of  Compilation," 

Lab.  Gaz.,  March  1920  and  Feb.  1921. 

(22) "Final Report on the Cost of Living of the Parliamentary Committee
of the Trades Union Congress" (The Committee, 32 Eccleston Sq.,
London, 1921); critical notices of the same in the Labour Gazette,
Aug. and Sept. 1921; and review by A. L. Bowley, Econ. Jour.,
Sept. 1921.

(23)  Bowley,  A.  L.,  Prices  and  Wages  in  the  United  Kingdom,  1914-20, 

Oxford,  1920  (Clarendon  Press). 

(24) March, L., "Rapport sur les indices de la situation économique,"
Bulletin de l'Institut International de Statistique, t. xxi., pt. 2, p. 3.

(25) Gini, C., "Quelques considérations au sujet de la construction des
nombres indices des prix, etc.," Metron, vol. iv., 1924, p. 3.

(26) Edgeworth, F. Y., "The Plurality of Index Numbers," Economic
Journal, vol. xxxv., 1925, p. 379.

(27) Edgeworth, F. Y., "The Element of Probability in Index Numbers,"
Jour. Roy. Stat. Soc., vol. lxxxviii., 1925, p. 557.

(28) Bowley, A. L., "The Influence on the Precision of Index Numbers
of the Correlation between the Prices of Commodities," Jour. Roy.
Stat. Soc., vol. lxxxix., 1926, p. 300.

Correlation  :  General,  and  History  (p.  188). 

(29)  Pearson,  K.,  "  Notes  on  the  History  of  Correlation,"  Biometrika,  vol. 

xiii.,  1920,  p.  25. 

(30) Baten, W. D., "Correction for the Moments of a Frequency
Distribution in Two Variables," Ann. Math. Stats., vol. ii., 1931, p. 309.

(31)  Frisch,  Ragnar,  "  Correlation  and  Scatter  in  Statistical  Variables," 

Nordic  Statistical  Journal,  vol.  i.,  1929,  p.  36. 

Fit  of  Regression  Lines  (p.  209). 

(32) Pearson, Karl, "On the Application of Goodness of Fit Tables to test
Regression Curves and Theoretical Curves used to describe
Observational or Experimental Data," Biometrika, vol. xi., 1916-17, p. 237.
(Criticises and extends the work of Slutsky.)

(33) Fisher, R. A., "The Goodness of Fit of Regression Formulae, and the
Distribution of Regression Coefficients," Jour. Roy. Stat. Soc., vol.
lxxxv., 1922, p. 597.

Correlation  in  Case  of  Non-linear  Regression  (p.  209). 

(34) Wicksell, S. D., "On Logarithmic Correlation, with an Application to
the Distribution of Ages at First Marriage," Meddelande från Lunds
Astronomiska Observatorium, No. 84, 1917. Svenska Aktuarieförenings
Tidskrift.




(35) Wicksell, S. D., "The Correlation Function of Type A," Kungl.
Svenska Vetenskapsakademiens Handl., Bd. lviii., 1917.

(36)  Pearson,  K.,  "  On  a  General  Method  of  Determining  the  Successive 

Terms  in  a  Skew  Regression  Line,"  Biometrika,  vol.  xiii.,  1921,  p.  296. 

(37) Pearson, Karl, "On the Correction necessary for the Correlation
Ratio $\eta$," Biometrika, vol. xiv., 1923, p. 412.

For  fitting  of  polynomials,  see  under  Correlation :  Time-problem. 

Correlation :  Effect  of  Errors  of  Observation,  etc.  (p.  225). 

(38)  Hart,  Bernard,  and  C.  Spearman,  "  General  Ability,  its  Existence 

and  Nature,"  Brit.  Jour.  Psychology,  vol.  v.,  1912,  p.  51. 

There has been a good deal of controversy about these formulae and
their applications in psychological work: cf. (267) Brown and Thomson,
and the references there given, critical notice of the same in
Brit.  Jour.  Psych.,  vol.  xii.,  1921,  p.  100,  and — 

(39) Stead, H. G., "The Correction of Correlation Coefficients," Jour. Roy.
Stat. Soc., vol. lxxxvi., 1923, p. 412.

Standardisation  or  Correction  of  Death-rates  (p.  226). 

For  the  methods  of  standardisation  in  present  use  in  England  and 
Wales  see — 

(40) Seventy-fourth Annual Report of the Registrar-General of Births, Deaths,
and Marriages in England and Wales (1911). [Cd. 6578, 1913.]

Reference  may  also  be  made  to — 

(41) Wolfenden, H. H., "On the Methods of comparing the Mortalities
of Two or More Communities, and the Standardisation of Death-rates,"
Jour. Roy. Stat. Soc., vol. lxxxviii., 1923, p. 399.

Correlation: Time-problem, Fitting of Trends, etc. (p. 208),
and Miscellaneous (p. 226).

(42) Harris, J. Arthur, "The Correlation between a Component, and
between the Sum of Two or More Components, and the Sum of the
Remaining Components of a Variable," Quart. Pub. American Stat.
Assoc., vol. xv., 1917, p. 854.

(43) Yule, G. U., "On the Time-correlation Problem," Jour. Roy. Stat. Soc.,
vol. lxxxiv., 1921, p. 497.

(44) Wicksell, S. D., "An Exact Formula for Spurious Correlation,"
Metron, vol. i., No. 4, 1921, p. 33.

(45) Pearson, Karl, and E. M. Elderton, "On the Variate Difference
Method," Biometrika, vol. xiv., 1923, p. 281.

(46) Anderson, O., "Ueber ein neues Verfahren bei Anwendung der
'Variate-Difference' Methode," Biometrika, vol. xv., 1923, p. 134.

(47) Yule, G. U., "Why do we sometimes get Nonsense Correlations
between Time-Series? A Study in Sampling and the Nature of
Time-Series," Jour. Roy. Stat. Soc., vol. lxxxix., 1926, p. 1.

(48) Anderson, O., "Ueber die Anwendung der Differenzenmethode
(Variate Difference Method) bei Reihenausgleichungen,
Stabilitätsuntersuchungen, und Korrelationsmessungen," Biometrika,
vol. xviii., 1926, p. 293.

(49) Gumbel, E. J., "Spurious Correlation and its Significance in
Physiology," Jour. Amer. Stat. Assoc., vol. xxi., 1926, p. 179.




(50)  Smith,  B.  B.,  "  Combining  the  Advantages  of  First-difference  and 

Deviation-from-Trend  Methods  of  Correlating  Time  Series,"  Jour. 
Amer.  Stat.  Assoc.,  vol.  xxi.,  1926,  p.  55. 

(51) Anderson, O., "On the Logic of the Decomposition of Statistical
Series into Separate Components," Jour. Roy. Stat. Soc., vol. xc.,
1927, p. 548.

(52) Hotelling, H., "An Application of Analysis Situs to Statistics," Bull.
Amer. Math. Soc., July-August 1927, p. 467.

(53) Isserlis, L., "Note on Chebysheff's Interpolation Formula,"
Biometrika, vol. xix., 1927, p. 87. (Fitting polynomials.)

(54) Anderson, Oskar, Die Korrelationsrechnung in der
Konjunkturforschung (Frankfurter Gesellschaft für Konjunkturforschung), Kurt
Schroeder, Bonn, 1929.

(55)  Darmois,  G.,  "  Analyse  et  comparaison  des  series  statistiques  qui  se 

developpent  dans  le  temps,"  Metron,  vol.  viii.,  Nos.  1-2,  1929,  p.  211. 

(56) Jordan, Charles, "Sur la détermination de la tendance séculaire des
grandeurs statistiques par la méthode des moindres carrés," Jour. de
la Société Hongroise de Statistique, vol. vii., 1929, p. 567.

(57)  Working,  H.,  and  H.  Hotelling,  "Applications  of  the  Theory  of 

Error  to  the  Interpretation  of  Trends,"  Jour.  Amer.  Stat.  Assoc., 
vol.  xxiv.,  1929,  supplt.  p.  73. 

(58) Allan, F. E., "The General Form of the Orthogonal Polynomials for
Simple Series, with Proofs of their Simple Properties," Proc. Roy.
Soc. Edin., vol. l., 1930, p. 310.

(59) Rhodes, E. C., "On the Fitting of Parabolic Curves to Statistical
Data," Jour. Roy. Stat. Soc., vol. xciii., 1930, p. 569.

(60)  Sipos,  Alexander,  "  Practical  Application  of  Jordan's  Method  for 

Trend  Measurement,"  Victor  Hornyanszky  Co.,  Ltd.,  Budapest, 
1930. 

(61)  Will,  Harry  S.,  "  On  Fitting  Curves  to  Observational  Series  by  the 

Method  of  Differences,"  Ann.  Math.  Stats.,  vol.  i.,  1930,  p.  159. 

(62) Frisch, Ragnar, "A Method of Decomposing an Empirical Series into
its Cyclical and Progressive Components," Jour. Amer. Stat. Assoc.,
vol. xxvi., 1931, supplt. p. 73.

(63)  Macaulay,  F.  G.,  "  Smoothing  of  Time  Series,"  New  York,  National 

Bureau  of  Economic  Research,  1931. 

Partial  Correlation  and  Partial  Correlation  Ratio  (p.  252). 

(64) Kelley, T. L., "Tables to facilitate the Calculation of Partial
Coefficients of Correlation and Regression Equations," Bulletin of the
University of Texas, No. 27, 1916. (Tables giving the values of
$\sqrt{(1-r_1^2)(1-r_2^2)}$ and $r_1 r_2/\sqrt{(1-r_1^2)(1-r_2^2)}$.)

(65) Pearson, Karl, "On the Partial Correlation Ratio," Proc. Roy. Soc.,
Series A, vol. xci., 1915, p. 492.

(66)  Isserlis,  L.,  "  On  the  Partial  Correlation  Ratio  ;  Part  ii.,  Numerical," 

Biometrika,  vol.  xi.,  1916-17,  p.  50. 

(67) Miner, J. R., Tables of $\sqrt{1-r^2}$ and $1-r^2$ for use in partial
Correlation, etc. The Johns Hopkins Press, Baltimore, 1922. (Six-figure
tables.)

(68)  Camp,  Burton  H.,  "  Mutually  Consistent  Multiple  Regression  Surfaces," 

Biometrika,  vol.  xvii.,  1925,  p.  443. 

(69) Kelley, T. L., and F. S. Salisbury, "An Iteration Method for
determining Multiple Correlation Constants," Jour. Amer. Stat. Assoc.,
vol. xxi., 1926, p. 282.

(70) Ezekiel, Mordecai, "The Determination of Curvilinear Regression
Surfaces in the Presence of Other Variables," Jour. Amer. Stat. Assoc.,
vol. xxi., 1926, p. 310.

(71) Hall, Philip, "Multiple and Partial Correlation Coefficients in the case
of an n-Fold Variate System," Biometrika, vol. xix., 1927, p. 100.

(72)  Tappan,  M.,  "  On  Partial  Multiple  Correlation  Coefficients  in  a  Universe 

of  Manifold  Characteristics,"  Biometrika,  vol.  xix.,  1927,  p.  39. 

(73) Tschuprow, A. A., transl. by L. Isserlis, "The Mathematical Theory
of the Statistical Methods employed in the Study of Correlation in the
case of Three Variables," Trans. Camb. Phil. Soc., vol. xxiii., 1928,
p. 337.

(74) Ezekiel, M., "The Application of the Theory of Error to Multiple and
Curvilinear Correlation," Jour. Amer. Stat. Assoc., vol. xxiv., 1929,
supplt. p. 99.

(75) Kelley, T. L., and Q. McNemar, "Doolittle versus the Kelley-Salisbury
Iteration Method, for Computing Multiple Regression Coefficients,"
Jour. Amer. Stat. Assoc., vol. xxiv., 1929, p. 164.

(76) Irwin, J. O., "Mathematical Theorems involved in the Analysis of
Variance," Jour. Roy. Stat. Soc., vol. xciv., 1931, p. 284.
See also the book by Ezekiel, reference (299).

Sampling  of  Attributes  (p.  273). 

(77) Detlefsen, J. A., "Fluctuations of Sampling in a Mendelian
Population," Genetics, vol. iii., 1918, p. 599.

(78) Rhodes, E. C., "On the Problem whether two given Samples can be
supposed to have been drawn from the same Population," Biometrika,
vol. xvi., 1924, p. 239, and Metron, vol. v., 1925, p. 3.

(79) Pearson, Karl, "On the Difference and the Doublet Tests for
Ascertaining whether Two Samples have been drawn from the same
Population," Biometrika, vol. xvi., 1924, p. 249.

See  also  under  Binomial,  Normal  Curve,  etc.,  below,  and  the 
General  References  for  Probable  Errors  on  p.  397. 

The  Law  of  Small  Chances  (p.  273). 

(80) Bortkiewicz, L. von, "Realismus und Formalismus in der
mathematischen Statistik," Allgemein. Stat. Arch., vol. ix., 1916, p. 225.
(Continues the discussion initiated by the paper of Miss Whitaker,
cited on p. 273.)

(81) Greenwood, M., and G. Udny Yule, "On the Statistical Interpretation
of some Bacteriological Methods employed in Water Analysis,"
Journal of Hygiene, vol. xvi., 1917, p. 36. (Applies a criterion
developed from Poisson's limit to the discrimination of water analyses;
numerous arithmetical examples.)

(82) "Student," "An Explanation of Deviations from Poisson's Law in
Practice," Biometrika, vol. x., 1919, p. 211.

(83) Bortkiewicz, L. von, "Ueber die Zeitfolge zufälliger Ereignisse,"
Bull. de l'Institut Int. de Stat., tome xx., 2e livr., 1915.

(84) Morant, G., "On Random Occurrences in Space and Time when
followed by a Closed Interval," Biometrika, vol. xiii., 1921, p. 309.
See  also  references  114,  115. 

Binomial,  Normal  Curve,  and  other  Frequency  Curves 
(p.  314). 

(85) Thiele, T. N., "The Theory of Observations," Ann. Math. Stats.,
vol. ii., 1931, p. 165. (A complete reprint of a work now out of print
and inaccessible, issued in 1903.)




(86) Pearson, Karl, "Second Supplement to a Memoir on Skew Variation,"
Phil. Trans. Roy. Soc., Series A, vol. ccxvi., 1916, p. 429. (Completes
the description of type frequency curves contained in references (1)
and (3) of p. 105.)

The advanced student who desires to compare the merits of different
frequency systems proposed, should consult refs. (87) and (89).

(87)  Charlier,  C.  V.  L.,  Numerous  papers  issued  from  the  Astronomical 

Department  of  Lund,  1906-12,  especially  "  Contributions  to  the 
Mathematical  Theory  of  Statistics  "  (1912). 

(88) Dodd, E. L., "On Ordinary Plane and Skew Curves," Bulletin of the
Univ. of Texas, No. 222, 1912.

(89) Edgeworth, F. Y., "On the Mathematical Representation of
Statistical Data," Jour. Roy. Stat. Soc., vol. lxxix., 1916, p. 456; lxxx.,
pp. 65, 266, 411; lxxxi., 1918, p. 322.

(90) Soper, H. E., Frequency Arrays, Cambridge University Press, 1922.

(91) Camp, B. H., "Probability Integrals for the Point Binomial,"
Biometrika, vol. xvi., 1924, p. 163.

(92) Edgeworth, F. Y., "Untried Methods of Representing Frequency,"
Jour. Roy. Stat. Soc., vol. lxxxvii., 1924, p. 571.

(93) Romanovsky, V., "Generalisation of some Types of the Frequency
Curves of Professor Pearson," Biometrika, vol. xvi., 1924, p. 106.

(94)  Pearson,  Karl,  "  Historical  Note  on  the  Origin  of  the  Normal  Curve 

of  Errors,"  Biometrika,  vol.  xvi.,  1924,  p.  402. 

(95) Camp, B. H., "Probability Integrals for a Hypergeometrical Series,"
Biometrika, vol. xvii., 1925, p. 61.

(96) Dodd, E. L., "The Frequency Laws of a Function of Variables with
given Frequency Laws," Annals of Mathematics, vol. xxvii., 1925,
p. 12.

(97) Dodd, E. L., "The Frequency Law of a Function of One Variable,"
Bull. Amer. Math. Soc., vol. xxxi., 1925.

(98) Rhodes, E. C., "On the Generalised Law of Error," Jour. Roy. Stat.
Soc., vol. lxxxviii., 1925, p. 576.

(99) Edgeworth, F. Y., "Mr Rhodes's Curve and the Method of
Adjustment," Jour. Roy. Stat. Soc., vol. lxxxix., 1926, p. 129.

(100) Charlier, C. V. L., "A New Form of the Frequency Function,"
Meddelande, Lunds Astronomiska Observatorium, 1928.

(101) Cramér, H., "On some Classes of Series used in Mathematical
Statistics," Den sjette Skandinaviske Matematikercongres,
Copenhagen, 1928.

(102) Cramér, H., "On the Composition of Elementary Errors,"
Skandinavisk Aktuarietidskrift, 1928.

(103) Geary, R. C., "The Frequency Distribution of the Quotient of Two
Normal Variables," Jour. Roy. Stat. Soc., vol. xciii., 1930, p. 442.

(104) Salvosa, L. R., "Tables of Pearson's Type III. Function," Ann.
Math. Stats., vol. i., 1930, p. 191.

(105) Dodd, E. L., "Classification of Sizes and Measures by Frequency
Functions," Jour. Amer. Stat. Assoc., vol. xxvi., 1931, p. 277.
(A survey: useful references.)

(106) Kondo, T., and E. M. Elderton, "Tables of the Functions of the
Normal Curve to Ten Decimal Places," Biometrika, vol. xxii.,
1931, p. 368.

(107) Rietz, H. L., "On certain Properties of Frequency Distributions
obtained by a Linear Fractional Transformation of the Variates
of a given Distribution," Ann. Math. Stats., vol. ii., 1931, p. 38.

(The above are concerned with the general theory of frequency
systems; the following deal with the forms which are suitable for
the representation of particular classes of data, e.g. statistics of
epidemic diseases, statistics of accidents, etc.)

(108)  Brownlee,  J.,  "  The  Mathematical  Theory  of  Random  Migration 

and  Epidemic  Distribution,"  Proc.  Roy.  Soc.  Edin.,  vol.  xxxi., 
1910-11,  p.  262. 

(109) Brownlee, J., "Certain Aspects of the Theory of Epidemiology in
Special Reference to Plague," Proc. Roy. Soc. Medicine, Sect.
Epidemiology and State Medicine, vol. x. D, 1918, p. 85. (The appendix
to this paper summarises the author's results and those of Sir Ronald
Ross; vide infra.)

(110) Ross, Sir Ronald, "An Application of the Theory of Probabilities
to the Study of a priori Pathometry," Proc. Roy. Soc., A, vol. xcii.,
1916, p. 204.

(111) Ross, Sir Ronald, and Hilda P. Hudson, "An Application of the
Theory of Probabilities to the Study of a priori Pathometry," Pts. II.
and III., Proc. Roy. Soc., A, vol. xciii., 1917, pp. 212 and 225.

(112) Knibbs, G. H., "The Mathematical Theory of Population," Appendix
A to vol. i. of Census of the Commonwealth of Australia. (Contains
a full discussion of the application of various frequency systems to
vital statistics.)

(113) Moir, H., "Mortality Graphs," Trans. Actuarial Soc. America, vol.
xviii., 1917, p. 311. (Numerous graphs of mortality rates in different
classes and periods.)

(114) Greenwood, M., and G. U. Yule, "An Enquiry into the Nature of
Frequency Distributions representative of Multiple Happenings,
with particular reference to the Occurrence of Multiple Attacks of
Disease or of Repeated Accidents," Jour. Roy. Stat. Soc., vol.
lxxxiii., 1920, p. 255.

(115) Newbold, Ethel M., "Practical Applications of the Statistics of
Repeated Events, particularly to Industrial Accidents," Jour. Roy.
Stat. Soc., vol. xc., 1927, p. 487.

Goodness  of  Fit  (p.  315  and  p.  370). 

(116)  Pearson,  Karl,  "  On  a  Brief  Proof  of  the  Fundamental  Formula  for 

testing  the  Goodness  of  Fit  of  Frequency  Distributions  and  on  the 
Probable  Error  of  P,"  Phil.  Mag.,  vol.  xxx.  D  (6th  ser.),  1916,  p.  369. 

(117) Pearson, Karl, "Multiple Cases of Disease in the same House,"
Biometrika, vol. ix., 1913, p. 28. (A modification of the
goodness-of-fit test to cover such statistics as those indicated by the title.)

(118) Fisher, R. A., "On the Interpretation of $\chi^2$ from Contingency Tables,
and the Calculation of P," Jour. Roy. Stat. Soc., vol. lxxxv., 1922,
p. 87.

(119) Yule, G. U., "On the Application of the $\chi^2$ Method to Association
and Contingency Tables, with experimental illustrations," Jour. Roy.
Stat. Soc., vol. lxxxv., 1922, p. 95. After correspondence with Mr
Fisher I wish to withdraw the statement on p. 97 of this paper,
that a full proof [of the general theorem as applied to contingency
tables] seems still to be lacking: he has convinced me that his proof
covers the case.

The  five  following  bear  on  the  two  preceding  papers  : — 

(120) Pearson, Karl, "On the $\chi^2$ Test of Goodness of Fit," Biometrika,
vol. xiv., 1922, p. 186; and "Further Note," ibid., p. 418.

(121) Bowley, A. L., and R. L. Connor, "Tests of Correspondence between
Statistical Grouping and Formulae," Economica, 1923, p. 1.




(122)  Fisher,  R.  A.,  "  Statistical  Tests  of  Agreement  between  Observation 

and  Hypothesis  "  (with  a  note  in  reply  by  A.  L.  Bowley),  Economica, 
1923,  p.  139. 

(123) Fisher, R. A., "The Conditions under which $\chi^2$ measures the
discrepancy between Observation and Hypothesis," Jour. Roy. Stat.
Soc., vol. lxxxvii., 1924, p. 442.

(124) Irwin, J. O., "Note on the $\chi^2$ Test for Goodness of Fit," Jour. Roy.
Stat. Soc., vol. xcii., 1929, p. 264.

(125) Sheppard, W. F., "The Fit of a Formula for Discrepant
Observations," Phil. Trans. Roy. Soc., A, vol. ccxxviii., 1929, p. 228.

(126) Neyman, J., and Egon S. Pearson, "Further Notes on the $\chi^2$
Distribution," Biometrika, vol. xxii., 1931, p. 298.
See also references 32, 33, and 167.

Normal Correlation, and Other Correlation Surfaces
(p. 332).

(127) Pearson, Karl, and Others (editorial), "Tables for Determining the
Volumes of a Bi-variate Normal Surface," Biometrika, vol. xxii.,
1930, p. 1.

(128)  Pretorius,  S.  J.,  "  Skew  Bi-variate  Frequency  Surfaces,  examined 

in  the  Light  of  Numerical  Illustrations,"  Biometrika,  vol.  xxii., 
1930,  p.  109. 

Probable  Errors,  Sampling,  etc.:  General  References 

(p.  355). 

(129)  Tchebycheff,  P.  L.  de,  "  Des  valeurs  moyennes,"  Journal  de 

Mathematiques  (2),  vol.  xii.,  1867,  pp.  177-84. 

(130) Dodd, E. L., "The Probability of the Arithmetic Mean compared
with that of certain other Functions of the Measurements," Annals
of Mathematics, vol. xiv., 1912-13.

(131) Isserlis, L., "On the Value of a Mean as calculated from a Sample,"
Jour. Roy. Stat. Soc., vol. lxxxi., 1918, p. 75.

(132) Soper, H. E., and Others, "On the Distribution of the Correlation
Coefficient in Small Samples," Biometrika, vol. xi., 1916-17, p. 328.

(133) Pearson, Karl, "On the Probable Error of Biserial $\eta$," Biometrika,
vol. xi., 1916-17, p. 292.

(134)  Young,  Andrew,  and  Karl  Pearson,  "  On  the  Probable  Error  of  a 

Coefficient  of  Contingency  without  Approximation,"  Biometrika, 
vol.  xi.,  1916-17,  p.  215. 

(135)  Pearson,  Karl  (editorial),  "  On  the  Probable  Errors  of  Frequency 

Constants,"  Pt.  III.,  Biometrika,  vol.  xiii.,  1920,  p.  113. 

(136)  "  Student,"  "  An  Experimental  Determination  of  the  Probable  Error 

of  Dr  Spearman's  Correlation  Coefficients,"  Biometrika,  vol.  xiii., 
1921,  p.  263. 

(137) Bispham, J. W., "An Experimental Determination of the
Distribution of the Partial Correlation Coefficient in Samples of Thirty,"
Proc. Roy. Soc., A, vol. xcvii., 1920, and Metron, vol. ii., 1923,
p. 684.

(138) Tschuprow, A. A., "On the Mathematical Expectation of the
Moments of Frequency Distributions," Biometrika, vol. xii.,
1918-19, pp. 140 and 185, and vol. xiii., 1921, p. 283; and Metron,
vol. ii., 1923, pp. 461 and 646.

(139) Fisher, R. A., "On the Probable Error of a Coefficient of Correlation
deduced from a Small Sample," Metron, vol. i., No. 4, 1921, p. 3.



(140) Fisher, R. A., "On the Mathematical Foundations of Theoretical
Statistics," Phil. Trans., A, vol. ccxxii., 1922, p. 309.

(141) Camp, Burton H., "A New Generalisation of Tchebycheff's
Statistical Inequality," Bull. Amer. Math. Soc., vol. xxviii., 1922.

(142) Meidell, H. Birger, "Sur un problème du calcul des probabilités
et les statistiques mathématiques," Comptes Rendus, vol. clxxv.,
1922, p. 806.

(143) Camp, Burton H., "Problems in Sampling," Jour. Amer. Stat. Assoc.,
vol. xviii., 1923, p. 964.

(144) Dodd, E. L., "The Greatest and the Least Variate under General
Laws of Error," Trans. Amer. Math. Soc., vol. xxv., 1923, p. 525.

(145) Meidell, H. Birger, "Sur la probabilité des erreurs," Comptes
Rendus, vol. clxxvi., 1923, p. 280.

(146) Pearson, E. S., "The Probable Error of a Class-index Correlation,"
Biometrika, vol. xiv., 1923, p. 261.

(147) Fisher, R. A., "The Distribution of the Partial Correlation
Coefficient," Metron, vol. iii., 1924, p. 329.

(148) Pearson, E. S., "Note on the Approximations to the Probable Error
of a Coefficient of Correlation," Biometrika, vol. xvi., 1924, p. 196.

(149) Church, A. E. R., "On the Moments of the Distribution of Squared
Standard Deviations for Samples of N drawn from an indefinitely
large Population," Biometrika, vol. xvii., 1925, p. 79.

(150) Fisher, R. A., "The Theory of Statistical Estimation," Proc. Camb.
Phil. Soc., vol. xxii., 1925, p. 700.

(151) Hotelling, Harold, "The Distribution of Correlation Ratios
Calculated from Random Data," Proc. Nat. Acad. Sci., vol. xi., 1925,
p. 657.

(152) Pearson, Karl, "Further Contributions to the Theory of Small
Samples," Biometrika, vol. xvii., 1925, p. 176.

(153) Splawa-Neyman, J., "Contributions to the Theory of Small Samples
drawn from a Finite Population," Biometrika, vol. xvii., 1925,
p. 472.

(154) Fisher, R. A., "Applications of 'Student's' Distribution" (and
following Tables by "Student"), Metron, vol. v., No. 3, p. 90,
1925.

(155) Tschuprow, A. A., "On the Asymptotic Frequency Distributions of
the Arithmetic Means of n Correlated Observations for very great
Values of n," Jour. Roy. Stat. Soc., vol. lxxxviii., 1925, p. 91.

(156) Dodd, E. L., "The Convergence of a General Mean of Measurements
to the True Value," Bull. Amer. Math. Soc., vol. xxxii., 1926.

(157) Rhodes, E. C., "The Comparison of Two Sets of Observations,"
Jour. Roy. Stat. Soc., vol. lxxxix., 1926, p. 544.

(158) Church, A. E. R., "On the Means and Squared Standard Deviations
of Small Samples from any Population," Biometrika, vol. xviii.,
1926, p. 321.

(159) Dodd, E. L., "The Convergence of General Means and the Invariance
of Form of certain Frequency Functions," Amer. Jour. Math.,
vol. xlix., 1927.

(160) Greenwood, M., and L. Isserlis, "An Historical Note on the
Problem of Small Samples," Jour. Roy. Stat. Soc., vol. xc., 1927,
p. 347.

(161) Hall, Philip, "The Distribution of Means for Samples of Size N
drawn from a Population in which the Variate takes Values between
0 and 1, all such Values being Equally Probable," Biometrika, vol.
xix., 1927, p. 240.

(162) Irwin, J. O., "On the Frequency Distribution of the Means of
Samples from a Population having any Law of Frequency with
Finite Moments, etc.," Biometrika, vol. xix., 1927, p. 225, and
vol. xxi., 1929, p. 431.

(163) Rhodes, E. C., "The Precision of Means and Standard Deviations
when the Individual Errors are Correlated," Jour. Roy. Stat. Soc.,
vol. xc., 1927, p. 135.

(164) Fisher, R. A., "The General Sampling Distribution of the Multiple
Correlation Coefficient," Proc. Roy. Soc., A, vol. cxxi., 1928, p. 654.

(165) Fisher, R. A., "Moments and Product Moments of Sampling
Distributions," Proc. London Math. Soc., vol. xxx., 1928, p. 199.

(166) Fisher, R. A., and L. H. C. Tippett, "Limiting Forms of the
Frequency Distribution of the Largest or Smallest Member of a Sample,"
Proc. Camb. Phil. Soc., vol. xxiv., 1928, p. 180.

(167) Neyman, J., and E. S. Pearson, "On the Use and Interpretation of
Certain Test Criteria for Purposes of Statistical Inference,"
Biometrika, vol. xx. A, 1928 and 1929, p. 175 and p. 263.

(168) Wishart, John, "The Generalised Product Moment Distribution in
Samples from a Normal Multivariate Population," Biometrika, vol.
xx. A, 1928, p. 32.

(169) Craig, C. C., "Sampling when the Parent Population is of Pearson's
Type III.," Biometrika, vol. xxi., 1929, p. 287.

(170) Fisher, R. A., "Tests of Significance in Harmonic Analysis," Proc.
Roy. Soc., A, vol. cxxv., 1929, p. 54.

(171) Holzinger, K. S., and A. E. R. Church, "On the Means of Samples
from a U-shaped Population," Biometrika, vol. xx. A, 1929, p. 361.

(172) Irwin, J. O., "On the Frequency Distribution of any Number of
Deviates from the Mean of a Sample from a Normal Population and
the Partial Correlations between them," Jour. Roy. Stat. Soc., vol.
xcii., 1929, p. 580.

(173) Kondo, T., "On the Standard Error of the Mean Square
Contingency," Biometrika, vol. xxi., 1929, p. 376.

(174) Pearson, Egon S., and N. K. Adyanthaya, "The Distribution of
Frequency Constants in Small Samples from Non-normal
Symmetrical and Skew Populations: Second Paper, Distribution of
'Student's' z," Biometrika, vol. xxi., 1929, p. 259.

(175) Pearson, Egon S., "Some Notes on Sampling Tests with Two
Variables," Biometrika, vol. xxi., 1929, p. 337.

(176)  Pearson,  Karl,  G.  B.  Jeffery  and  E.  M.  Elderton,  "  On  the 

Distribution  of  the  First  Product-moment  Coefficient  in  Small 
Samples  drawn  from  an  Indefinitely  Large  Normal  Population," 
Biometrika,  vol.  xxi.,  1929,  p.  164. 

(177)  Pepper,  Joseph,  "  Studies  in  the  Theory  of  Sampling,"  Biometrika, 

vol.  xxi.,  1929,  p.  231.  (The  general  theory  of  sampling  from  any 
bi-variate  population.) 

(178)  Rider,  Paul  R.,  "  On  the  Distribution  of  the  Ratio  of  Mean  to 

Standard  Deviation  in  Small  Samples  from  Non-normal  Universes," 
Biometrika,  vol.  xxi.,  1929,  p.  124. 

(179) Romanovsky, V., "On the Moments of Means of Functions of One
and More Random Variables," Metron, vol. viii., Nos. 1 and 2,
1929, p. 251.

(180) Shohat, J. (Jacques Chokhate), "Inequalities for Moments of
Frequency Functions and for Various Statistical Constants,"
Biometrika, vol. xxi., 1929, p. 361.

(181) Soper, H. E., "The General Sampling Distribution of the Multiple
Correlation Coefficient," Jour. Roy. Stat. Soc., vol. xcii., 1929, p. 445.

(182) Wishart, John, "The Correlation between Product Moments of any
Order in Samples from a Normal Population," Proc. Roy. Soc. Edin.,
vol. xlix., 1929, p. 1.

(183)  Woo,  T.  L.,  "  Tables  for  ascertaining  the  Significance  or  Non- 

significance  of  Association  Measured  by  the  Correlation  Ratio," 
Biometrika,  vol.  xxi.,  1929,  p.  1. 

(184)  Baker,  George  A.,  "  The  Significance  of  the  Product-moment 

Coefficient,  with  special  reference  to  the  Marginal  Distributions," 
Jour.  Amer.  Stat.  Assoc.,  vol.  xxv.,  1930,  p.  387  ;  and  the  related 
Paper:  Pearson,  Egon  S.,  "The  Test  of  the  Significance  for  the 
Correlation  Coefficient,"  Jour.  Amer.  Stat.  Assoc.,  vol.  xxvi.,  1931, 
p.  128. 

(185)  Baker,  George  A.,  "  Distribution  of  the  Means  of  Samples  of  n 

drawn  at  random  from  a  Population  represented  by  a  Gram- 
Charlier  Series,"  Ann.  Math.  Stats.,  vol.  i.,  1930,  p.  199,  and  note 
by  C.  C.  Craig,  ibid.,  vol.  ii.,  1931,  p.  99. 

(186) Baker, George A., "Random Samples from Non-homogeneous
Populations," Metron, vol. viii., No. 3, 1930, p. 67.

(187) Berkson, Joseph, "Bayes' Theorem," Ann. Math. Stats., vol. i.,
1930, p. 42.

(188) Ezekiel, Mordecai, "The Sampling Variability of Linear and
Curvilinear Regression," Ann. Math. Stats., vol. i., 1930, p. 275.

(189) Fisher, R. A., "Inverse Probability," Proc. Camb. Phil. Soc., vol.
xxvi., 1930, p. 528.

(190) Fisher, R. A., "The Moments of the Distribution for Normal Samples
of Measures of Departure from Normality," Proc. Roy. Soc., A,
vol. cxxx., 1930, p. 16.

(191) Hotelling, H., "The Consistency and Ultimate Distribution of
Optimum Statistics," Trans. Amer. Math. Soc., vol. xxxii., 1930,
p. 847.

(192) Irwin, J. O., "On the Frequency Distribution of the Means of
Samples from Populations of certain of Pearson's Types," Metron,
vol. vii., No. 4, 1930, p. 51.

(193) Kondo, T., "A Theory of the Sampling Distribution of Standard
Deviations," Biometrika, vol. xxii., 1930, p. 36.

(194) Pearson, Egon S., "A Further Development of Tests for Normality,"
Biometrika, vol. xxii., 1930, p. 239.

(195) Pearson, Egon S., and J. Neyman, "On the Problem of Two
Samples," Bull. de l'Acad. Polonaise des Sci. et des Lettres, Series A,
1930, p. 73.

(196)  Smith,  C.  D.,  "  On  Generalised  Tchebycheff  Inequalities  in  Mathe- 

matical Statistics,"  Amer.  Jour.  Math.,  vol.  lii.,  No.  1,  1930. 

(197) Soper, H. E., "Sampling Moments of Moments of Samples of n
Units each drawn from an Unchanging Sampled Population, from
the Point of View of Semi-invariants," Jour. Roy. Stat. Soc., vol.
xciii., 1930, p. 104.

(198) Wishart, J., "The Derivation of certain High-order Sampling
Product Moments from a Normal Population," Biometrika, vol.
xxii., 1930, p. 224.

(199) Bortkiewicz, L. von, "The Relation between Stability and
Homogeneity," Ann. Math. Stats., vol. ii., 1931, p. 1.

(200) Craig, C. C., "Sampling in the Case of Correlated Observations,"
Ann. Math. Stats., vol. ii., 1931, p. 324.

(201) Hotelling, H., "The Generalisation of 'Student's' Ratio," Ann.
Math. Stats., vol. ii., 1931, p. 360.

(202) McKay, A. T., "The Distribution of the Estimated Coefficient of
Variation," Jour. Roy. Stat. Soc., vol. xciv., 1931, p. 564.

(203) Molina, E. C., "Bayes' Theorem," Ann. Math. Stats., vol. ii., 1931, p. 25.

(204) Pearson, Karl, and Brenda Stoessiger, "Tables of the Probability
Integrals of Symmetrical Frequency Curves in the Case of Low
Powers, such as arise in the Theory of Small Samples," Biometrika,
vol. xxii., 1931, p. 253.

(205) Pearson, Karl, "On the Nature of the Relationship between Two
of 'Student's' Variates (z₁ and z₂) when Samples are taken from a
Bi-variate Normal Population," Biometrika, vol. xxii., 1931, p. 405.

(206) Rider, Paul R., "On Small Samples from certain Non-normal
Universes," Ann. Math. Stats., vol. ii., 1931, p. 48.

(207) Wishart, J., "The Mean and Second-moment Coefficient of the
Multiple Correlation Coefficient in Samples from a Normal
Population," Biometrika, vol. xxii., 1931, p. 353. (With an Editorial
appendix of tables of the mean value and squared standard
deviation of a multiple correlation coefficient.)

On  the  problem  of  fluctuations  of  sampling  in  correlations  between 
time-series,  see  also  Yule  (47). 

General. 

(208) Irwin, J. O., "Recent Advances in Mathematical Statistics," Jour.
Roy. Stat. Soc., vol. xciv., 1931, p. 568. (A useful survey, with
references, of the work of 1930: a similar article promised on the
work of 1931.)

Tables of Functions, etc. (p. 358).

(209) Pearson, Karl, Tables of the Incomplete Gamma-Function, H.M.
Stationery Office, London, 1922. Price £2, 2s. 0d. net.

(210) Pearson, Karl (edited by), Tables for Statisticians and Biometricians,
Part II., 1931. To be obtained from the Secretary, Biometric
Laboratory, University College, London, England. Price 30s.,
post free. (Part I., now in its second edition, price 15s., is now
only to be had from the same address.)

(211) British Association Mathematical Tables, vol. i., London, 1931. Office
of the British Association, Burlington House, London, W.1, price
10s., post free. (Circular and Hyperbolic Functions; Exponential
Sine and Cosine Integrals; Factorial (Gamma) and Derived
Functions; Integrals of Probability Integral. Many tables useful for
modern statistical work.)

Errors  of  Sampling  in  Agricultural  Experiment. 

A  good  deal  of  work  has  been  done  on  this  particular  branch  of 
the  subject,  and  the  following  references  may  be  useful : — 

(212)  Berry,  R.  A.,  and  D.  G.  O'Brien,  "  Errors  in  Feeding  Experiments 

with  Cross-bred  Pigs,"  Jour.  Agr.  Sci.,  vol.  xi.,  1921,  p.  275. 

(213) Harris, J. A., "On a Criterion of Substratum Homogeneity (or
Heterogeneity) in Field Experiments," Amer. Naturalist, 1916,
p. 430.

(214) Hall, A. D., E. J. Russell, T. B. Wood, S. U. Pickering, S. H.
Collins, "The Interpretation of the Results of Agricultural
Experiments," Journal of the Board of Agriculture, Supplement 7, 1911.
(Contains a collection of papers on error in field trials, feeding
experiments, horticultural work, milk-testing, etc.)

(215) Lyon, T. L., "Some Experiments to Estimate Errors in Field Plat
Tests," Proc. Amer. Soc. of Agronomy, vol. iii., 1911, p. 89.

(216) Mercer, W. B., and A. D. Hall, "The Experimental Error of Field
Trials," Jour. Agr. Sci., vol. iv., 1911, p. 107. (With an appendix
by "Student" describing the chessboard method of conducting
yield trials.)

(217) Mitchell, H. H., and H. S. Grindley, "The Element of Uncertainty
in the Interpretation of Feeding Experiments," Univ. of Illinois
Agr. Expt. Station, Bull. 165, 1913.

(218) Robinson, G. W., and W. E. Lloyd, "On the Probable Error of
Sampling in Soil Surveys," Jour. Agr. Sci., vol. viii., 1915, p. 144.

(219) Surface, F. M., and Raymond Pearl, "A Method of Correcting for
Soil Heterogeneity in Variety Tests," Jour. Agr. Research, vol. v.,
1916, p. 1039.

(220) Wood, T. B., and R. A. Berry, "Variation in the Chemical
Composition of Mangels," Jour. Agr. Sci., vol. i., 1905, p. 16.

(221) Wood, T. B., "The Feeding Value of Mangels," Jour. Agr. Sci.,
vol. iii., 1910, p. 225.

(222) Wood, T. B., and F. J. M. Stratton, "The Interpretation of
Experimental Results," Jour. Agr. Sci., vol. iii., 1910, p. 417.

(223) Beaven, E. S., "Trials of New Varieties of Cereals," Jour. Min.
Agric. (England and Wales), vol. xxix., 1922, pp. 337 and 436.

(224) "Student," "On Testing Varieties of Cereals," Biometrika, vol. xv.,
1923, p. 271, and supplementary note, vol. xvi., 1924, p. 411.

(225) Hatton, R. G., N. H. Grubb, and R. C. Knight, "Black Currant
Trials," Jour. of Pomology and Horticultural Science, vol. iv., 1925,
p. 2.

(226) Hayes, H. K., "Control of Soil Heterogeneity and Use of the Probable
Error Concept in Plant-Breeding Studies," Univ. Minnesota Agric.
Expt. Stn., Tech. Bull. 30, 1925.

(227) Trought, Trevor, "A Statistical Note on the Cotton Variety Tests
at Sakha, 1916-20," Min. Agric. Egypt, Tech. and Sci. Service, Bull.
51, 1925.

(228) Bailey, M. A., and T. Trought, "An Account of Experiments carried
out to Determine the Experimental Error of Field Trials with Cotton
in Egypt," Min. Agric. Egypt, Tech. and Sci. Service, Bull. 63,
1926.

(229) Engledow, F. L., "A Census of an Acre of Corn," Jour. Agr. Sci.,
vol. xvi., 1926, p. 166; and later papers of the series in vols.
xviii., xix., xx.

(230) Engledow, F. L., and G. U. Yule, "The Principles and Practice
of Yield-Trials," Empire Cotton-Growing Corporation, Millbank
House, Millbank, London, S.W.1, 1926, revised edition, 1930.
Price 2s. (Reprint from the Empire Cotton-Growing Review.)

(231) Fisher, R. A., "The Arrangement of Field Experiments," Jour. Min.
Agric. (England and Wales), 1926.

(232) Lord, L., "The Preliminary Testing of Pure Line Selections of Rice,"
Tropical Agriculturist, vol. lxvii., 1926.

(233) "Student," "Mathematics and Agronomy," Jour. Amer. Soc.
Agronomy, vol. xviii., 1926.

(234) Eden, T., and R. A. Fisher, "The Experimental Determination of
the Value of Top-dressings with Cereals," Jour. Agr. Sci., vol. xvii.,
1927, p. 548.

(235) Hubback, J. A., "Sampling for Rice Yield in Bihar and Orissa,"
Agric. Research Institute, Pusa, Bull. 166, 1927.


(236) Möller-Arnold, E., "Untersuchungen über Möglichkeiten der
Verminderung der Fehler von Feldversuchungen in der Praxis,"
Landw. Jahrb., vol. lxv., 1927, p. 943.

(237) Hayes, H. K., and F. R. Immer, "A Study of Probable Error Methods
in Field Experiments," Sci. Agric., vol. viii., 1928, p. 345.

(238) Maskell, E. J., "Experimental Error," Tropical Agriculture,
Trinidad, vol. v., 1928, p. 306, and vol. vi., 1929, pp. 5, 45, 97.

(239) Neyman, J., "The Theoretical Basis of Different Methods of Testing
Cereals. 1. The Method of E. Zaleski." (In English. Reprint
from the Journal Wiadomosci Matematyczne, 1928.) Scientific
Publications of K. Buszczynski & Sons, Ltd., No. 1, Pedigree Seed
Cultures, Warsaw.

(240) Roemer, T., "Les essais comparatifs de rendements," Bull. Assoc.
Int. Select. Plantes Grande Culture, vol. i., 1928, p. 158.

(241) Clapham, A. R., "The Estimation of Yield in Cereal Crops by
Sampling Methods," Jour. Agr. Sci., vol. xix., 1929, p. 214.

(242) Möller-Arnold, E., Der Feldversuch in der Praxis, Julius Springer,
Berlin, 1929.

(243) Wishart, J., and A. R. Clapham, "A Study in Sampling Technique:
the Effect of Artificial Fertilisers on the Yield of Potatoes," Jour.
Agr. Sci., vol. xix., 1929, p. 589.

(244) Fisher, R. A., and J. Wishart, "The Arrangement of Field
Experiments and the Statistical Reduction of the Results," Technical
Communication No. 10 of the Imperial Bureau of Soil Science,
H.M. Stationery Office, London, 1930. (Price 1s. net.)

(245) Jørgensen, M., "Om Beregning af Usikkerheden paa
Forsøgsresultater," Tidsskr. Planteavl., vol. xxxvi., 1930, p. 149.

(246) Kindermann, M., "Untersuchungen über die günstigste Grösse von
Versuchsteilstücken," Landw. Jahrb., vol. lxxii., 1930, p. 141.

(247) Maskell, E. J., "Field Experiments on Sugar-cane," Tropical
Agriculture, Trinidad, vol. vii., 1930, pp. 101, 125.

(248) Mitscherlich, E. A., "Die Beurteilung der Ergebnisse von Sorten-
und Stammanbauversuchen," Z. Züchtung, 1930, p. 223.

(249) Richey, F. D., "Some Applications of Statistical Method to
Agronomic Experiments," Jour. Amer. Stat. Assoc., vol. xxv., 1930,
p. 269.

(250) Roemer, T., "Der Feldversuch, eine kritische Studie," Z. Züchtung,
1930, p. 483. (Dritte Auflage: beim Bezuge durch die D.L.G.,
Berlin.)

(251)  Sanders,  H.  G.,  "  A  Note  on  the  Value  of  Uniformity  Trials  for 

Subsequent  Experiments,"  Jour.  Agr.  Sci.,  vol.  xx.,  1930,  p.  63. 

(252)  Behrens,  W.  U.,  "  Zur  Fehlerberechnung  bei  Feldversuchen  nach 

der  Methode  Knut  Vik,"  Pflanzenbau,  vol.  viii.,  1931,  p.  31. 

(253)  Christidis,  B.  G.,  "  The  Importance  of  the  Shape  of  Plot  in  Field 

Experimentation,"  Jour.  Agr.  Sci.,  vol.  xxi.,  1931,  p.  14. 

(254)  Clapham,  A.  R.,  and  T.  Wake  Simpson,  "  Studies  in  Sampling 

Technique  :  Cereal  Experiments.  I.  Field  Technique,"  Jour.  Agr. 
Sci.,  vol.  xxi.,  1931,  p.  366. 

(255)  Eden,  T.,  "  The  Experimental  Errors  of  Field  Experiments  with  Tea," 

Jour.  Agr.  Sci.,  vol.  xxi.,  1931,  p.  547. 

(256) Hoblyn, T. L., "Field Experiments in Horticulture," Technical
Communication No. 2, Imperial Bureau of Fruit Production,
1931.

(257) Papadakis, J., "Some Considerations on the Technique of Field
Experiments," Bull. Assoc. Int. Select. Plantes Grande Culture,
vol. iv., 1931, p. 59.

(258) Tedin, O., "The Influence of Systematic Plot Arrangement upon
the Estimate of Error in Field Experiments," Jour. Agr. Sci.,
vol. xxi., 1931, p. 191.

(259) Wishart, J., "The Analysis of Variance Illustrated in its
Application to a Complex Agricultural Experiment on Sugar-beet," Archiv
für Pflanzenbau, Bd. 5, 1931, p. 561.

Applications  of  Statistical  Method  to  Engineering  Problems. 

This  is  also  a  branch  on  which  much  work  has  been  done  of  recent  years, 
but  it  is  one  with  which  I  am  so  wholly  unfamiliar  that  I  cannot  undertake 
to  give  any  detailed  bibliography.  The  following  books  may  be  found 
useful,  and  will  give  references : — 

(260) Becker, R., H. Plaut, und I. Runge, Anwendungen der
mathematischen Statistik auf Probleme der Massenfabrikation, Julius
Springer, Berlin, 1927. (Reprint 1930.)

(261) Fry, T. C., Probability and its Engineering Uses, London, Macmillan
& Co.; New York, D. van Nostrand & Co.; 1928.

(262) Kohlweiler, Emil, Statistik im Dienste der Technik, R. Oldenbourg,
München und Berlin, 1931.

The "Reprints" of the Bell Telephone Laboratories Incorporated, New
York, include a number coming under the present head. Mention may be
made in particular of Reprint B-297 (reprinted from the Journal of the
Franklin Institute, vol. ccv., 1928): Economic Aspects of Engineering
Applications of Statistical Methods, by W. A. Shewhart, with a bibliography.

Works on Theory of Statistics, Probability, etc.
(App. II., p. 361).

(263) Bachelier, L., Calcul des probabilités, tome i., Gauthier-Villars,
Paris, 1912.

(264) Bachelier, L., Le jeu, la chance, et le hasard, Flammarion, Paris, 1914.

(265) Bowley, A. L., Elements of Statistics, P. S. King, London, 5th ed.,
1926. (Part II., "Applications of Mathematics to Statistics," can
be purchased separately.)

(266) Bowley, A. L., Elementary Manual of Statistics, Macdonald and
Evans, London, 4th ed., 1928. (A new edition of this elementary
work, to which reference is made in Appendix II., p. 360. Part II.,
dealing with different groups of official statistics, has been largely
rewritten.)

(267) Brown, W., and G. H. Thomson, The Essentials of Mental
Measurement, 2nd ed., Cambridge University Press, 1921.

(268) Brunt, David, The Combination of Observations, Cambridge
University Press, 1917.

(269) Czuber, E., Die stat. Forschungsmethode, L. W. Seidel, Wien, 1921.

(270) Elderton, W. Palin, Frequency Curves and Correlation, 2nd ed.,
London, C. & E. Layton, 1927.

(271) Fisher, Arne, The Mathematical Theory of Probabilities and its
Application to Frequency Curves and Statistical Methods, vol. i., New
York (Macmillan), 1915: 2nd ed., enlarged, 1922.

(272) Forcher, Hugo, Die statistische Methode als selbständige Wissenschaft,
Leipzig, 1913 (Veit).

(273) Henry, A., Calculus and Probability for Actuarial Students, C. & E.
Layton, London, 1922.

(274) Jones, D. C., A First Course in Statistics, Bell & Sons, London, 1921.

(275) Julin, A., Principes de statistique théorique et appliquée: tome i.,
Statistique théorique, Paris (Rivière), Bruxelles (Dewit), 1921.

(276) Keynes, J. M., A Treatise on Probability, Macmillan, London, 1921.

(277) West, C. J., Introduction to Mathematical Statistics, Adams & Co.,
Columbus, 1918.

An  inexpensive  reprint  of  Laplace's  Essai  philosophique  (ref.  17  on  p.  361) 
has  been  published  by  Gauthier-Villars  (Paris,  1921)  in  the  series  entitled 
"  Les  maitres  de  la  pensee  scientifique." 


During recent years interest in statistical method has been evidenced by
the issue of a rapidly increasing number of books on the subject. Of those
in the following list, the first five and (288) to (290) will all be found useful
as supplementing the present volume. Pearl's work is specially intended
for those interested in vital statistics, but will be useful also to others.
Kelley's book covers a great deal of ground not touched in the present
volume and, though more critical discussion of some of the methods seems
to me desirable, the student will find much that is not otherwise accessible
in volume form. In the very useful handbook edited by H. L. Rietz, each
chapter is written by a specialist: chapters on Interpolation, Curve Fitting,
and Periodogram Analysis, for example, all deal with matters not discussed
in this Introduction. R. A. Fisher's Statistical Methods is a laboratory
handbook rather than a text-book, and brings together in convenient form
for the research worker the numerous special methods developed, mainly
by himself, with especial reference to small samples. Whittaker and
Robinson's treatise is advanced and covers a wide field for statisticians and
others. The little book by the late Professor Tschuprow the student may
not find easy reading, but it deals with fundamentals. The small work
by Rietz will interest even the specialist. Darmois' work is on completely
different lines from the present and is to be recommended to the student
of mathematical ability. The book by Westergaard and Nybolle is very
simply and practically written, with many examples; there are chapters
on Interpolation, Vital Statistics, and Insurance.

(278)  Pearl,  R.,  Introduction  to  Medical  Biometry  and  Statistics,  W.  B. 

Saunders  Co.,  Philadelphia  and  London,  1923  ;  2nd  ed.  enlarged, 
1930. 

(279)  Kelley,  Truman  L.,  Statistical  Method,  The  Macmillan  Co.,  New 

York,  1923. 

(280)  Rietz,  H.  L.  (edited  by),  Handbook  of  Mathematical  Statistics, 

Houghton  Mifflin  Co.,  Boston,  1924. 

(281)  Fisher,  R.  A.,  Statistical  Methods  for  Research  Workers,  Oliver  and 

Boyd,  Edinburgh  and  London,  3rd  ed.,  1930. 

(282)  Whittaker,  E.  T.,  and  G.  Robinson,  The  Calculus  of  Observations, 

Blackie  &  Son,  London,  1924. 

(283)  Tschuprow,  A.  A.,  Grundbegriffe  und  Grundprobleme  der  Korrelations- 

theorie,  Teubner,  Leipzig,  1925. 

(284) Niceforo, A., La Méthode Statistique, Marcel Giard, Paris, 1925.

(285) Secrist, H., An Introduction to Statistical Methods, revised edition,
The Macmillan Co., New York, 1925.

(286) Crum, L. W., and A. C. Patton, Economic Statistics, A. W. Shaw Co.,
Chicago and New York, A. W. Shaw & Co., Ltd., London, 1925.

(287) Day, Edmund E., Statistical Analysis, The Macmillan Co., New
York, 1925.

(288) Rietz, H. L., Mathematical Statistics, Open Court Publishing Co.,
Chicago, 1927. (A small work, one of a series intended for those
who have some mathematical knowledge but are not specialists.
Useful references.)

(289) Darmois, G., Statistique Mathématique, Paris, Librairie Octave Doin,
1928.

(290) Westergaard, H., and H. C. Nybolle, Grundzüge der Theorie der
Statistik, Fischer, Jena, 1928. (Nominally the 2nd ed. of
Westergaard's work of 1890 (25, p. 361), but entirely rewritten.)

(291) Jordan, Charles, Statistique Mathématique, Gauthier-Villars, Paris,
1927.

(292) Burnside, W., Theory of Probability, Cambridge University Press,
1928.

(293) Chaddock, Robert E., Principles and Methods of Statistics, Houghton
Mifflin & Co., Boston, 1928.

(294) Mises, R. von, Wahrscheinlichkeit, Statistik und Wahrheit, Springer,
Berlin, 1928.

(295) Banister, H., Elementary Applications of Statistical Method, Blackie
& Son, Ltd., London and Glasgow, 1929. (A simple book for
beginners, based on experience with students of psychology.)

Vital  Statistics. 

The  two  following  books  on  vital  statistics  are  both  revised  editions, 
Newsholme's  book  having  been  completely  rewritten. 

(296)  Newsholme,  Sir  Arthur,  The  Elements  of  Vital  Statistics,  revised 

edition,  Allen  &  Unwin,  London,  1923. 

(297) Whipple, G. C., Vital Statistics, 2nd ed., Wiley & Sons, New York;
Chapman & Hall, London, 1923.

The student of vital statistics who wishes to go on to modern methods
should get Pearl's book (278).

Books,  Recent. 

The  preceding  lists  give  books  published,  or  of  which  the  first  edition 
was  published,  prior  to  the  revision  for  press  of  the  ninth  edition  (1929) 
of  this  Introduction  to  the  Theory  of  Statistics.  The  following  have  been 
issued  since  that  date  : — 

(298) Kohn, Stanislav, Základy Teorie Statistické Metody (Elements of the
Theory of Statistical Method), published by the State Statistical
Office of the Czechoslovak Republic, Prague, 1929. (A solid work
of 483 pp.: detailed bibliographies.)

(299) Ezekiel, Mordecai, Methods of Correlation Analysis, John Wiley
& Sons, New York; Chapman & Hall, London, 1930. (Full
treatment of methods of computation, especially the methods that
have been developed by American writers for handling problems
with many variables.)

(300) Harper, F. H., Elements of Practical Statistics, Macmillan, New
York, 1930. (A manual for students not trained in mathematics.)

(301) March, Lucien, Les Principes de la Méthode Statistique, Félix Alcan,
Paris, 1930. (Comprehensive but elementary in treatment, and
very lucid in style, as one has learned to expect from the Honorary
Director of the Statistique Générale de la France: illustrations and
examples mainly from economic and demographic statistics.)

(302) Scarborough, J. B., Numerical Mathematical Analysis, Johns
Hopkins University Press, Baltimore; Milford, London, 1930.
(Covers the same sort of ground as Whittaker and Robinson,
ref. (282).)

(303) Montessus de Ballore, R. de, Probabilités et Statistiques, Hermann
& Cie, Paris, 1931. (Applications of the binomial series to the
fitting of frequency distributions.)

(304) Steffensen, J. F., Some Recent Researches in the Theory of Statistics
and Actuarial Science, Cambridge University Press, 1930. (The
substance of three lectures delivered in London.)

(305) Mises, R. von, Wahrscheinlichkeitsrechnung und die Anwendung in
der Statistik und theoretische Physik, Deuticke, Wien, 1931.

(306) Tippett, L. H. C., The Methods of Statistics, Williams & Norgate, Ltd.,
London, 1931. (Useful to the student already possessing some
knowledge who wants an introduction to the methods of R. A.
Fisher, analysis of variance, etc. Illustrations mainly biological.)

(307) Winkler, Wilhelm, Grundriss der Statistik, I. Theoretische Statistik,
Julius Springer, Berlin, 1931. (A section of the Enzyklopädie der
Rechts- und Staatswissenschaft: no knowledge of higher
mathematics assumed.)

(308)  Woods,  Hilda  M.,  and  W.  T.  Russell,  An  Introduction  to  Medical 

Statistics,  P.  S.  King  &  Son,  Ltd.,  London,  1931.  (An  elementary 
introduction,  not  only  to  the  special  methods  of  vital  statistics, 
but  to  statistical  method  in  general.) 
CHAPTER I.

1.  N 

{A) 

iO 

2. (ABC)
(ABγ)
(AβC)
(Aβγ)

3. The frequencies not given in the question itself are:
(a) (AB) 107       (AC) 405       (BC) 525.

(b) (Aβγ) 22,980       (αBγ) 13,585       (αβC) 96,478       (αβγ) 28,868,495.
{AB)    (B)  .         {AB)  {B) 

{Afi)    (/8)  {AB)  +  {JP)    {B)  +  {&)* 

{AB)    {A)  ^,  ,  .       {AB)  {A) 

(AB)  {A) 
{aB)>\a)' 

5. (AB) + (BC) − (B), i.e., the sum of the excesses of (AB) and (BC) over (B)/2.
8. 160. Take A = husband exceeding wife in first measurement, B =
husband exceeding wife in second measurement, and find (αβ).


CHAPTER  II. 

1.  80/263  or  304  per  thousand. 

2.  55/85  or  65  per  cent. 

3.  32  per  cent,  and  30  per  cent 

4.  117. 

5.  108. 

8-  P^i  (1  -2?),  p<^k  (1  +  ^3'),  i.e.,  p  niUBt  lie  between  0  and  J  (1  -  2q)  or 
between  ^  (1+2$')  and  ^. 

9.  As  a  hint,  remember  the  condition  that — 

(BC) ≥ (B) + (C) − N.
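The condition in the hint is one of the usual consistency limits on a second-order frequency: (BC) cannot fall below (B) + (C) − N, nor below zero, and cannot exceed the smaller of (B) and (C). The following is a minimal sketch in Python; the function names and the figures used are illustrative assumptions and are not taken from the question.

```python
def bc_limits(n, b, c):
    """Return the (lowest, highest) value of (BC) consistent with N, (B), (C)."""
    lower = max(0, b + c - n)   # (BC) >= (B) + (C) - N, and never negative
    upper = min(b, c)           # (BC) cannot exceed either (B) or (C)
    return lower, upper


def is_consistent(n, b, c, bc):
    """True if the stated (BC) lies within the consistency limits."""
    lower, upper = bc_limits(n, b, c)
    return lower <= bc <= upper


# Illustrative figures only (hypothetical, not from the exercise).
limits = bc_limits(1000, 700, 600)
print(limits)
print(is_consistent(1000, 700, 600, 250))
```

With the hypothetical figures above, any stated (BC) under 300 would be inconsistent with the totals.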
CHAPTER III.

1.  Deaf-mutes  from  childhood  per  million  among  males  222  ;  among 
females  183  ;  there  is  therefore  positive  association  between  deaf-mutism  and 
male  sex  :  if  there  had  been  no  association  between  deaf-mutism  and  sex,  there 
would  have  been  3176  male  and  3393  female  deaf-mutes. 

2. (a) positive association, since (AB)₀ = 1457.

(b) negative association, since 294/490 = 3/5, 380/570 = 2/3.
(c) independence, since 256/768 = 1/3, 48/144 = 1/3.
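The comparisons in (b) and (c) can be checked with exact fractions. The sketch below assumes that the first ratio in each answer is the proportion of A's among the B's and the second the proportion of A's among the not-B's; that reading, and the function name, are assumptions made for illustration.

```python
from fractions import Fraction


def association(ab, b, a_not_b, not_b):
    """Classify the association by comparing (AB)/(B) with (A beta)/(beta)."""
    among_b = Fraction(ab, b)
    among_not_b = Fraction(a_not_b, not_b)
    if among_b > among_not_b:
        kind = "positive"
    elif among_b < among_not_b:
        kind = "negative"
    else:
        kind = "independence"
    return kind, among_b, among_not_b


case_b = association(294, 490, 380, 570)   # figures quoted in answer 2 (b)
case_c = association(256, 768, 48, 144)    # figures quoted in answer 2 (c)
print(case_b)
print(case_c)
```

Exact fractions are used so that the equality in (c) is tested without rounding error.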

3.      Percentage of Plants above the Average Height.

                                     Parentage
                            Crossed.          Self-fertilised.
Ipomoea purpurea          86 per cent.         25 per cent.
Petunia violacea          79    ,,             17    ,,
Reseda lutea              78    ,,             34    ,,
Reseda odorata            71    ,,             45    ,,
Lobelia fulgens           50    ,,             35    ,,

The association is much less for the species at the end than for those at the
beginning of the list.

4.  Percentage  of  dark-eyed  amongst  the  sons  of  dark-eyed  fathers  39  per 
cent. 

Percentage  of  dark-eyed  amongst  the  sons  of  not  dark-eyed  fathers  10  per 
cent. 

If there had been no heredity, the frequencies to the nearest unit would
have been (AB)₀ 18, (Aβ)₀ 111, (αB)₀ 121, (αβ)₀ 750.

5.  Percentage  of  light-eyed  amongst  the  wives  of  light-eyed  husbands  59 
per  cent. 

Percentage  of  light-eyed  amongst  the  wives  of  not  light-eyed  husbands  53 
per  cent. 

If there had been no association: (AB)₀ = 298, (Aβ)₀ = 225, (αB)₀ = 143,
(αβ)₀ = 108.
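The independence frequencies quoted in answers 4 and 5 follow the rule (AB)₀ = (A)(B)/N, and similarly for the other three classes. The sketch below recovers the totals by adding the rounded frequencies printed in the answers (for Qu. 4, 18 + 111 = 129 and 18 + 121 = 139, with N = 1000; for Qu. 5, 523, 441 and N = 774). These totals are therefore inferred, not taken from the questions themselves, and the function name is hypothetical.

```python
def independence_frequencies(n, a, b):
    """Second-order frequencies expected if A and B were independent."""
    alpha = n - a   # number of not-A's
    beta = n - b    # number of not-B's
    return {
        "AB": a * b / n,
        "A_beta": a * beta / n,
        "alpha_B": alpha * b / n,
        "alpha_beta": alpha * beta / n,
    }


def to_nearest_unit(freqs):
    """Round each expected frequency to the nearest unit."""
    return {name: round(value) for name, value in freqs.items()}


# Totals inferred from the printed answers (an assumption, see above).
qu4 = to_nearest_unit(independence_frequencies(1000, 129, 139))
qu5 = to_nearest_unit(independence_frequencies(774, 523, 441))
print(qu4)
print(qu5)
```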

6.  The  following  are  the  proportions  of  the  insane  per  thousand  in 
successive  age  groups  : — 

In general population:  0.9,  2.3,  4.1,  5.7,  6.9,  7.5,  7.7, 6.8.
Amongst the blind:     20.1, 16.0, 16.3, 20.7, 18.3, 17.8, 11.4, 5.3.

Note  the  diminishing  association,  which  is  especially  clear  in  the  age-group 
65 — ,  and  the  negative  association  in  the  last  age-group.  The  association 
coefficient  gives  the  values  below,  which  decrease  continuously  : — 

Association coefficient: +0.92, +0.75, +0.61, +0.57, +0.46, +0.41,
+0.20, −0.13.
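The coefficients can be reproduced approximately from the proportions alone, using Q = (ad − bc)/(ad + bc). The sketch below makes three assumptions: the rate in the general population is treated as if it were the rate among the not-blind; the seventh figure for the blind is read as 11.4; and the last general-population figure is read as 6.8. It is a rough check only, since the printed values were presumably computed from the actual frequencies.

```python
def yule_q(rate_in_group, rate_outside, per=1000.0):
    """Association coefficient Q from two proportions given per `per`."""
    p1 = rate_in_group / per    # proportion insane among the blind
    p0 = rate_outside / per     # proportion insane among the rest (approximated)
    ad = p1 * (1.0 - p0)
    bc = p0 * (1.0 - p1)
    return (ad - bc) / (ad + bc)


# Proportions per thousand, as read from the answer (see assumptions above).
general = [0.9, 2.3, 4.1, 5.7, 6.9, 7.5, 7.7, 6.8]
blind = [20.1, 16.0, 16.3, 20.7, 18.3, 17.8, 11.4, 5.3]
printed = [0.92, 0.75, 0.61, 0.57, 0.46, 0.41, 0.20, -0.13]

computed = [yule_q(b, g) for b, g in zip(blind, general)]
for age_group, (q, p) in enumerate(zip(computed, printed), start=1):
    print(age_group, round(q, 2), p)
```

Each computed value should agree with the printed coefficient to within about one unit in the second decimal place, which is as close as the approximation can be expected to come.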


CHAPTER  IV. 


(D)/N        =  6.9 per cent.        (A)/N        =  6.8 per cent.
(AD)/(A)     = 45.0    ,,            (AD)/(D)     = 44.6    ,,

(βD)/(β)     =  3.6    ,,            (Aβ)/(β)     =  4.7    ,,
(AβD)/(Aβ)   = 41.2    ,,            (AβD)/(βD)   = 54.9    ,,

(BD)/(B)     = 42.7    ,,            (AB)/(B)     = 29.2    ,,
(ABD)/(AB)   = 51.6    ,,            (ABD)/(BD)   = 35.3    ,,


The  above  give  two  legitimate  comparisons.  The  general  results  are  the  same 
as  for  the  boys,  i.e.  a  very  small  association  between  development-defects  and 
dulness  amongst  those  exhibiting  nerve-signs,  as  compared  with  those  who  do 
not exhibit nerve signs, or with the girls in general. As the association
amongst those who do not exhibit nerve-signs is quite as high as for the girls
in general, the "conclusion" quoted does not seem valid.


2.
                    (1)            (2)                             (1)            (2)
                per thousand.  per thousand.                   per thousand.  per thousand.
(B)/N                3.2            7.5       (A)/N                 0.9            4.0
(AB)/(A)            14.9           11.7       (AB)/(B)              4.0            6.3

(BC)/(C)            38.8           63.0       (AC)/(C)              6.6           18.8
(ABC)/(AC)         216            214         (ABC)/(BC)           36.8           63.8


The above give the two simplest comparisons, either of which is sufficient to
show that there is a high association between blindness and mental derangement
amongst the deaf-mutes as well as in the general population; amongst
the old, the association is, in fact, small for the general population, but well-
marked for deaf-mutes. This result stands in direct contrast with that of
Qu. 1, where the association between the two defects A and D was much
smaller in the defective universe B than in the universe at large. As previously
stated, no great reliance can be placed on the census data as to these infirmities.

3. If the cancer death-rates for farmers over 45 and under 45 respectively
were the same as for the population at large, the rate for all farmers aged 15
and upwards would be 1.11. This is slightly less than the actual rate 1.20, but
the excess would not justify the statement that "farmers were peculiarly liable to cancer."
It  is,  in  point  of  fact,  due  to  the  further  differences  of  age-distribution  that  we 
have  neglected,  e.g.  amongst  those  over  45  there  are  more  over  55  amongst 
farmers  than  amongst  the  general  population,  and  so  on. 

4.  15  per  cent. 

6. If A and B were independent in both the C and γ universes, we would have
(AB) equal to

(471 × 419)/617 + (151 × 139)/383 = 374.7.

Actually (AB) only = 358. Therefore A and B must be disassociated in one or
both partial universes.
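
The arithmetic can be checked with a short sketch. It assumes only the figures printed in the answer, reading them as (AC) = 471, (BC) = 419, (C) = 617 in the one universe and 151, 139, 383 in the other; the function name is illustrative.

```python
def expected_ab(sub_universes):
    """Sum of (A)(B)/N over sub-universes, i.e. the value (AB) would take
    if A and B were independent within each one.
    Each item is (count_A, count_B, total) for one sub-universe."""
    return sum(a * b / n for a, b, n in sub_universes)


# Figures as printed in the answer: the C universe, then the not-C universe.
value = expected_ab([(471, 419, 617), (151, 139, 383)])
print(round(value, 1))   # 374.7
```

Since the observed (AB) = 358 falls short of this figure, the association must be negative in at least one of the two sub-universes.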

9. (1) 68.1 per cent. (2) 42.5 per cent. The fallacy discussed in § 2 is
now  avoided,  and  there  seems  no  reason  for  declining  to  consider  this  as  evidence 
of  the  effect  of  expenditure  on  election  results. 

10. The limits to y are:

y < ½(3x − x² − 1),
  > ½(x + x²),

subject to the conditions y ≤ x, y ≥ 0, y ≥ 2x − 1. No inference of a positive
association from two negatives is possible unless x lies between the limits
0.382 . . . and 0.618 . . . .
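
The two limits for x can be recovered numerically. The sketch assumes the setting implied by the answer, namely (A)/N = (B)/N = (C)/N = x and (AB)/N = (AC)/N = y, so that the consistence conditions give (BC)/N ≥ 3x − 2y − 1. This reading of the exercise is an assumption, since the question itself is not reproduced here, and the helper name is hypothetical.

```python
from math import sqrt

# An inference of positive association between B and C from two negative
# associations needs some y with
#   max(0, 2x - 1) <= y < min(x**2, (3x - x**2 - 1) / 2).
# The binding requirements reduce to two quadratics in x:
#   (3x - x^2 - 1)/2 > 0       ->  x^2 - 3x + 1 < 0
#   (3x - x^2 - 1)/2 > 2x - 1  ->  x^2 + x - 1 < 0
lower = (3 - sqrt(5)) / 2    # smaller root of x^2 - 3x + 1 = 0
upper = (-1 + sqrt(5)) / 2   # positive root of x^2 + x - 1 = 0


def inference_possible(x):
    """True if some admissible y below x**2 forces (BC)/N above x**2."""
    y_min = max(0.0, 2 * x - 1)
    y_max = min(x * x, (3 * x - x * x - 1) / 2)
    return y_min < y_max


print(round(lower, 3), round(upper, 3))   # 0.382 0.618
```

For instance, `inference_possible(0.5)` is true, while values of x outside the two limits, such as 0.3 or 0.7, leave no admissible y.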

11. The limits to y are:

(1) y < ½(6x − 6x² − 1),
      > ½(x + 6x²),

subject to the conditions y ≥ 0, y ≥ 4x − 1, y ≤ x.

An inference is only possible from positive associations of AB and AC if
x < ⅙; an inference is only possible from two negative associations if x lies
between 0.211 . . . and 0.274 . . . . Note that x cannot exceed ⅓.

(2) y < ½(6x − 3x² − 1),

subject to the conditions y ≥ 0, y ≥ 5x − 1, y ≤ x.

No inference is possible from positive associations of AB and BC.
An inference is only possible from negative associations if x lie between
0.183 . . . and 0.215 . . . . Note that x cannot exceed ¼.

(3) y < ½(6x − 2x² − 1),

subject to the conditions y ≥ 0, y ≥ 5x − 1, y ≤ x.

As in (2), no inference is possible from positive associations of AC and BC;
an inference is possible from negative associations if x lie between 0.177 . . .
and 0.224 . . . . Note that x cannot exceed ¼.


CHAPTER  V. 
1. A, 0.68. B, 0.36.


CHAPTER  VI. 
1. 1200; 200. 2. 100; 20. 3. 146.25. 4. 216.5.


CHAPTER  VII. 

2. Mean, 156.73 lb. Median, 154.67 lb. Mode (approx.), 150.6 lb. (Note
that the mean and the median should be taken to a place of decimals further
than is desired for the mode: the true mode, found by fitting a theoretical
frequency curve, is 151.1 lb.)

3. Mean, 0.6330. Median, 0.6391. Mode (approx.), 0.651. (True mode
is 0.653.)

4. £35.5 approximately.

5. (1) 116.0. (2) Means 77.4, 89.0, ratio 114.9. (3) Geometrical means 77.2,
88.9, ratio 115.2. (4) 115.2.

6.  (1)  921,507.  (2)  916,963. 

7. 1st qual. 10s. 6|d. 2nd qual. 9s. 2|d.

8.  n.p.  If  the  terms  of  the  given  binomial  series  are  multiplied  by  0,  1,  2,  3 
.  .  .  ,  note  that  the  resulting  series  is  also  a  binomial  when  a  common  factor 
is  removed.    [The  full  proof  is  given  in  Chapter  XV.  §  6.] 


CHAPTER  VIII. 

2. Standard deviation 21.3 lb. Mean deviation 16.4 lb. Lower quartile
142.5, upper quartile 168.4; whence Q = 12.95. Ratios: m.d./s.d. = 0.77,
Q/s.d. = 0.61. Skewness, 0.29.
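
The derived figures follow directly from the constants quoted. In the sketch below the mean and approximate mode are taken from the answer to Chapter VII, qu. 2, on the assumption that both questions refer to the same distribution, and the skewness is assumed to be measured as (mean − mode)/standard deviation.

```python
sd = 21.3                     # standard deviation, lb.
mean_dev = 16.4               # mean deviation, lb.
q1, q3 = 142.5, 168.4         # lower and upper quartiles, lb.
mean, mode = 156.73, 150.6    # assumed: mean and approximate mode from Chapter VII, qu. 2

semi_iqr = (q3 - q1) / 2          # the quartile deviation Q
skewness = (mean - mode) / sd     # assumed measure of skewness

print(round(semi_iqr, 2))         # 12.95
print(round(mean_dev / sd, 2))    # 0.77
print(round(semi_iqr / sd, 2))    # 0.61
print(round(skewness, 2))         # 0.29
```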

3. Approximately lower quartile = £26.1, upper quartile = £54.6, ninth
decile = £94.

5. (1) M = 73.2, σ = 17.3. (2) M = 73.2, σ = 17.5. (3) M = 73.2, σ = 18.0.
(Note that while the mean is unaffected in the second place of decimals, the
standard deviation is the higher the coarser the grouping.)

6. √(npq). The proof is given in Chapter XV. § 6.

7. The assumption that observations are evenly distributed over the
intervals does not affect the sum of deviations, except for the interval in which
the mean or median lies: for that interval the sum is n₂(0.25 + d²), hence the
entire correction is

d(n₁ − n₃) + n₂(0.25 + d²).

In this expression d is, of course, expressed as a fraction of the class-interval,
and is given its proper sign. Notice that the n₁ and n₃ of this question are
not the same as the N₁ and N₂ of § 16.


CHAPTER  IX. 

1. σx = 1.414, σy = 2.280, r = +0.81. X = 0.5Y + 0.5. Y = 1.3X + 1.1.
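
The two regression equations can be rebuilt from the three constants. The sketch assumes the usual slopes r·σx/σy and r·σy/σx; the means 3 and 5 are not given in the answer and are hypothetical values, chosen because they are the only pair satisfying both printed equations.

```python
sigma_x, sigma_y, r = 1.414, 2.280, 0.81

b_xy = r * sigma_x / sigma_y   # slope of the regression of X on Y
b_yx = r * sigma_y / sigma_x   # slope of the regression of Y on X

# Hypothetical means (not stated in the answer): both regression lines
# pass through the point of means, and (3, 5) is the only point lying on
# both printed lines.
mean_x, mean_y = 3.0, 5.0
a_xy = mean_x - b_xy * mean_y
a_yx = mean_y - b_yx * mean_x

print(f"X = {b_xy:.1f}Y + {a_xy:.1f}")   # X = 0.5Y + 0.5
print(f"Y = {b_yx:.1f}X + {a_yx:.1f}")   # Y = 1.3X + 1.1
```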

2. Using the subscripts 1 for earnings, 2 for pauperism, 3 for out-relief ratio,
M₃ = 5.79, σ₃ = 3.09: r₁₃ = −0.13, r₂₃ = +0.60.


CHAPTER  XI. 

1. 1.232 per cent. (against 1.240 per cent.): 2.556 in. against 2.572 in.

2. The corrected standard-deviation is 0.9954 of the rough value.

3. Estimated true standard-deviation 6.91: standard-deviation of fluctuations
of sampling 9.38. (The latter, which can be independently calculated,
is too low, and the former consequently probably too high. Cf. Chap. XIV,
§ 10.)

4. 0.43.

5.  58  per  cent. 

6.  (r27V(^7W)W  +  ^ 
7  ^'^i  

8. 0.30.

The  others  may  be  written  down  from  symmetry. 

10. (1) No effect at all. (2) If the mean value of the errors in variables is
d, and in the weights e, the value found for the weighted mean is:

The true value + d − r σx σw e / [w̄(w̄ + e)].

If r is small, d is the important term, and hence errors in the quantities are
usually of more importance than errors in the weights. If r become considerable,
errors in the weights may be of consequence, but it does not seem probable
that the second term would become the most important in practical cases.

11,  g  =  2/3. 

12,  ^  =  077. 

CHAPTER  XII. 

1. r₁₂.₃ = +0.759, r₁₃.₂ = +0.097, r₂₃.₁ = −0.436.
σ₁.₂₃ = 2.64, σ₂.₁₃ = 0.594, σ₃.₁₂ = 70.1.
X₁ = 9.31 + 3.37 X₂ + 0.00364 X₃.

2. r₁₂.₃₄ = +0.680, r₁₃.₂₄ = +0.803, r₁₄.₂₃ = +0.397.
r₂₃.₁₄ = −0.433, r₂₄.₁₃ = −0.553, r₃₄.₁₂ = −0.149.

σ₁.₂₃₄ = 9.17, σ₂.₁₃₄ = 49.2, σ₃.₁₂₄ = 12.5, σ₄.₁₂₃ = 105.4.

X₁ = 53 + 0.127 X₂ + 0.587 X₃ + 0.0345 X₄.

3. The correlation of the pth order is r/(1 + pr). Hence if r be negative, the
correlation of order n − 2 cannot be numerically greater than unity and r
cannot exceed (numerically) 1/(n − 1).
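
The closed form can be verified by applying the ordinary recurrence for partial correlations repeatedly. The sketch assumes, as the answer implies, that every pair of variables has the same zero-order correlation r; the function name is illustrative.

```python
def partial_r(r, order):
    """Partial correlation of the given order when every pair of variables
    has the same zero-order correlation r.

    Applies the recurrence r12.3 = (r12 - r13*r23) / sqrt((1-r13^2)(1-r23^2)),
    which collapses to (c - c^2) / (1 - c^2) when all three inputs equal c.
    """
    c = r
    for _ in range(order):
        c = (c - c * c) / (1 - c * c)
    return c


r = 0.5
for p in range(4):
    # Agreement with the closed form r / (1 + p r).
    assert abs(partial_r(r, p) - r / (1 + p * r)) < 1e-12

# With n variables and r negative, the coefficient of order n - 2 reaches -1
# when r = -1/(n - 1), which is the limit stated in the answer.
n = 4
print(partial_r(-1 / (n - 1), n - 2))
```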

4. −r₁₂.

5. r₁₂.₃ = −1, r₁₃.₂ = r₂₃.₁ = +1.

6.  »*i2  3  =  ''l3.a  =  '*23'l=  ~ 

CHAPTER  XIII. 

1. Theo. M = 6, σ = 1.732: Actual M = 6.116, σ = 1.732.

2. (a) Theo. M = 2.5, σ = 1.118: Actual M = 2.48, σ = 1.14.
(b)   „   M = 3,   σ = 1.225:   „   M = 2.97, σ = 1.26.
(c)   „   M = 3.5, σ = 1.323:   „   M = 3.47, σ = 1.40.

3. Theo. M = 50, σ = 5: Actual M = 50.11, σ = 5.23.
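
The theoretical values in answers 1 to 3 are consistent with the binomial formulae M = np and σ = √(npq). In the sketch below the pairs (n, p) are assumptions inferred from the printed means and standard deviations, since the experiments themselves are described in the exercises and not here.

```python
from math import sqrt


def binomial_mean_sd(n, p):
    """Mean n*p and standard deviation sqrt(n*p*q) of a binomial count."""
    return n * p, sqrt(n * p * (1 - p))


# Assumed numbers of events per trial, with p = 1/2 throughout (inferred,
# not stated in the answers).
for n in (12, 5, 6, 7, 100):
    m, s = binomial_mean_sd(n, 0.5)
    print(n, m, round(s, 3))
```

With these assumed values the output reproduces the pairs 6 and 1.732, 2.5 and 1.118, 3 and 1.225, 3.5 and 1.323, and 50 and 5.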

4. The standard deviation of the proportion is 0.00179, and the actual
divergence  is  f)  A  times  this,  and  therefore  almost  certainly  significant. 

5. The standard deviation of the number drawn is 32, and the actual
difference from expectation 18. There is no significance.

6. p = 1 − σ²/M, n = M/p: p = 0.510, n = 12.0: p = 0.454, n = 110.4.

8. Standard deviation of simple sampling 23.0 per cent. The actual
standard-deviation does not, therefore, seem to indicate any real variation, but
only fluctuations of sampling.

9. Difference from expectation 7.5: standard error 10.0. The difference
might therefore occur frequently as a fluctuation of sampling.

10.  The  test  can  be  applied  either  by  the  formulae  of  Case  II.  or  Case  III. 
Case  II.  is  taken  as  the  simplest. 

(a) (AB)/(B) = 69.1 per cent.: (Aβ)/(β) = 80.0 per cent. Difference 10.9
per cent. (A)/N = 71.1 per cent., and thence ε₁₂ = 12.9 per cent. The actual
difference is less than this, and would frequently occur as a fluctuation of
simple sampling.

(b) (AB)/(B) = 70.1 per cent.: (Aβ)/(β) = 64.3 per cent. Difference 5.8 per
cent. (A)/N = 67.6 per cent., and thence ε₁₂ = 3.40 per cent. The actual
difference is 1.7 times this, and might, rather infrequently, occur as a fluctuation
of simple sampling.


CHAPTER  XIV. 


Row.     σp.      Group of Rows.         σp.
1        3.1      5, 6, and 7            2.1
2        2.1      8, 9, 10, and 11       1.6
3        1.7      12, 13, and 14         1.2
4        2.7      15 and upwards         1.1

σp is given in units per 1000 births, as s and s₀.

2. s₀ = 7.02, and σp = 2.5 units.

3. σ² = npq as if the chance of success were p in all cases (but the mean is
n/2 not pn).

4. Mean number of deaths per annum = σ₀² = 680, σ² = 566,582.
r = 0.000029.

CHAPTER XV.


(1)  0     1         7   792
     1    12         8   495
     2    66         9   220
     3   220        10    66
     4   495        11    12
     5   792        12     1
     6   924
     Total, 4096

(2)  0   459.4       5   116.4
     1  1102.6       6    27.2
     2  1212.8       7     4.7
     3   808.6       8     0.6
     4   363.9
     Total, 4096.2

(3)  0   192
     1   288
     2   144
     3    24
     Total, 648
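
The three series can be regenerated as terms of N(q + p)ⁿ. The parameters used below are assumptions inferred from the printed totals and from the ratios of successive terms, namely 4096(½ + ½)¹², 4096(⅚ + ⅙)¹² and 648(⅔ + ⅓)³; the exercise itself is not reproduced here.

```python
from math import comb


def binomial_frequencies(total, n, p):
    """Terms of total * (q + p)**n for r = 0..n successes."""
    q = 1 - p
    return [total * comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]


# Assumed parameters (inferred from the printed figures, not stated there).
series_1 = binomial_frequencies(4096, 12, 1 / 2)
series_2 = binomial_frequencies(4096, 12, 1 / 6)
series_3 = binomial_frequencies(648, 3, 1 / 3)

print([round(f) for f in series_1])
print([round(f, 1) for f in series_2[:9]])
print([round(f) for f in series_3])
```

On these assumptions the first and third series come out exactly as printed, and the second agrees with the printed terms to within about 0.1, which suggests the printed values were built up from rounded intermediate figures.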


2. The frequency of r successes is greater than that of r − 1 so long as
r < np + p: if np is an integer, r = np gives the greatest term and also the mean.

3. This follows at once from a consideration of the Galton-Pearson apparatus.


4.   Binomial.    Normal curve.
         1            1.7
        10           10.5
        45           42.7
       120          116.1
       210          211.5
       252          258.4
       210          211.5
      etc.          etc.
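
The comparison can be reproduced by evaluating the normal curve with the same total, mean and standard deviation as the binomial. The sketch assumes the binomial is 1024(½ + ½)¹⁰, which is inferred from the printed terms 1, 10, 45, 120, 210, 252.

```python
from math import comb, exp, pi, sqrt

n, p = 10, 0.5                       # assumed: the binomial 1024 (1/2 + 1/2)^10
total = 2 ** n
mean, sd = n * p, sqrt(n * p * (1 - p))


def normal_ordinate(r):
    """Ordinate at r of the normal curve with the same total, mean and s.d."""
    return total / (sd * sqrt(2 * pi)) * exp(-((r - mean) ** 2) / (2 * sd ** 2))


for r in range(7):
    print(comb(n, r), round(normal_ordinate(r), 1))
```

The normal ordinates lie slightly above the binomial terms at the centre and in the extreme tails, and slightly below them in between.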


5. The data are M = 68.855, σ = 2.56, y₀ = 155.8.

6. (1) United Kingdom: direct 1.75, from standard-deviation 1.73.
(2) Cambridge students: direct 1.88, from standard-deviation 1.73.

7. 70.6 per cent. 8. 27 per cent.

9. (1) In a 12.4 per cent., b 1.0 per cent. of the trials, assuming normality,
but the assumption is hardly quite valid. (2) a about 13 times in 100,000
trials; b practically impossible, being a deviation of over 7 times the standard
error.

10. 853. 11. Mean 74.3, standard-deviation 3.23.


CHAPTER  XVI. 

3. From equations (10) and (11) replace σ₁ and σ₂ by Σ₁ and Σ₂ in equation
(9). Regarding this as an equation for r, note that r² is a maximum when
tan 2θ is infinite, or θ = 45°.

4. In fig. 50, suppose every horizontal array to be given a slide to the right
until  its  mean  lies  on  the  vertical  axis  through  the  mean  of  the  whole  distribu- 
tion :  then  suppose  the  ellipses  to  be  squeezed  in  the  direction  of  this  vertical 
axis  until  they  become  circles.  The  original  quadrant  has  now  become  a 
sector  with  an  angle  between  one  and  two  right  angles,  and  the  question  is 
solved  on  determining  its  magnitude. 


CHAPTER  XVII. 

1. Estimated frequency 1554, standard error 0.28 lb. 2. Lower Q,
frequency 1472, standard error 0.26 lb.; upper Q, frequency 1116, standard
error 0.34 lb. 3. 0.18 lb. 4. 0.24 lb., 17 per cent. less than the standard
error of the median. 5. 0.0196 in. or 0.76 per cent. of the standard-deviation:
the standard error of the semi-interquartile range is 1.23 per cent. of that
range.


r.       n = 100.     n = 1000.
0.0      0.1          0.0316
0.2      0.096        0.0304
0.4      0.084        0.0266
0.6      0.064        0.0202
0.8      0.036        0.0114
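
The tabulated values agree with the large-sample standard error of the correlation coefficient, (1 − r²)/√n. The sketch below assumes that this is the formula the exercise intends; the function name is illustrative.

```python
from math import sqrt


def se_r(r, n):
    """Large-sample standard error of a correlation coefficient,
    assumed here to be (1 - r^2) / sqrt(n)."""
    return (1 - r * r) / sqrt(n)


for r in (0.0, 0.2, 0.4, 0.6, 0.8):
    print(r, round(se_r(r, 100), 3), round(se_r(r, 1000), 4))
```

The standard error falls as r rises and is divided by √10 when the number of observations is multiplied by ten.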

INDEX. 


[The  references  are  to  pages.  The  subject-matter  of  the  Exercises  given  at 
the  ends  of  the  chapters  has  been  indexed  only  when  such  exercises  (or 
the  answers  thereto)  give  the  constants  for  statistical  tables  in  the  text, 
or  theoretical  results  of  general  interest ;  in  all  such  cases  the  number  of 
the  question  cited  is  given.  In  the  case  of  authors'  names,  citations  in 
the  text  are  given  first,  followed  by  citations  of  the  authors'  papers  or 
books  in  the  lists  of  references.] 


Ability, general, refs., 392.

Accident, deaths from (law of small chances), 265-266.

Accidents, frequency-distributions, refs., 396.

Achenwall, Gottfried, Abriss der Staatswissenschaft, 2.

Adyanthaya, N. K., refs., sampling, 399.

Ages,  at  death  of  certain  women 
(table),  78  ;  of  husband  and  wife 
(correlation),  159  ;  diagram,  173  ; 
constants,  (qu.  3)  189. 

Aggregate,  of  classes,  10-11. 

Agricultural  labourers'  earnings.  See 
Earnings. 

Agriculture,  experiment,  errors  in, 
refs.,  401-404. 

Airy,  Sir  G.  B.,  use  of  terms  "  error 
of  mean  square  "  and  "  modulus," 
144.  Refs.,  Theory  of  Errors  of 
Observation,  360. 

Allan,  F.  E.,  refs.,  fitting  poly- 
nomials, 393. 

Ammon, O., hair and eye-colour data
cited  from,  61. 

Analysis,  harmonic.  See  Harmonic 
analysis. 

Analysis of variance. See Variance.

Analysis Situs, refs., Hotelling, 393.

Anderson, O., correlation difference method, 198; refs., 208, 392, 393.

Annual  value  of  dwelling-houses 
(table),  83 ;  of  estates  in  1715, 
table,  100;  diagram,  101. 

Arithmetic  mean.  See  Mean,  arith- 
metic. 


Array,  def.,  164 ;  standard-devia- 
tion of,  177,  204-205,  236-237  ;  in 
normal  correlation,  319-321. 

Association,  generally,  25-59  ;  def., 
28  ;  degrees  of,  29-39  ;  testing  by 
comparison  of  percentages,  30-35  ; 
constancy  of  difference  from  in- 
dependence values  for  the  second- 
order  frequencies,  35-36  ;  co- 
efficients of,  37-39  ;  illusory  or 
misleading,  48-51  ;  total  possible 
number  of,  for  n  attributes,  54-56 ; 
case  of  complete  independence, 
56-57  ;  use  of  ordinary  correlation- 
coefficient  as  measure  of  asso- 
ciation, 216-217  ;  Pearson's  co- 
efficient based  on  normal  corre- 
lation (refs.),  40,  333  ;  refs.,  15, 
39-40,  333. 

Association,  partial,  generally,  42- 
59  ;  the  problem,  42-43  ;  total 
and  partial,  def.,  44  ;  arithmetical 
treatment,  44-48  ;  testing,  in 
ignorance  of  third-order  frequen- 
cies, 51-54  ;  refs.,  57. 

—  examples  :  inoculation  against 
cholera,  31-32,  34-35,  382-384; 
deaths  and  occupation,  52-53  ; 
deaf-mutism  and  imbecility,  32- 
33  ;  eye-colour  of  father  and  son, 
33-34  ;  eye-colour  of  grandparent, 
parent,  and  offspring,  46-48,  53 - 
54 ;  colour  and  prickliness  of 
Datura  fruits,  36-37,  377-378  ; 
defects  in  school  children,  45-46. 

Asymmetrical frequency-distributions, 90-102; relative positions
of mean, median, and mode in, 121-122; diagrams, 113-114. See
also Frequency-distributions.

Asymmetry  in  frequency-distribu- 
tions, measures  of,  107,  149-150. 

Attributes,  theory  of,  generally, 
1-59  ;  def.,  7  ;  notation,  9-10, 
14-15  ;  positive  and  negative,  10  ; 
order  and  aggregate  of  classes, 
10-11  ;  ultimate  classes,  12  ; 
positive classes, 13-14; consistence of class-frequencies, 17-24
(see Consistence); association of, 25-59, 377-381 (see Association);
sampling of, 254-334 (see Sampling of attributes).

Averages, generally, 106-132; def., 107; desirable properties of,
107-108; forms of, 108; average in sense of arithmetic mean, 109;
refs., 129-130. See Mean, Median, Mode.

Axes,  principal,  in  correlation,  321- 
322. 

Bachelier, L., refs., Calcul des probabilités, 404; Le jeu, la chance
et le hasard, 404.

Bailey,  M.  A.,  refs.,  cotton  trials,  402. 

Baker,  G.  A.,  refs.,  sampling,  400. 

Banister, H., refs., Elementary Applications of Statistical Method, 406.

Barlow,  P.,  tables  of  squares,  etc., 
67  ;  refs.,  357. 

Barometer  heights,  table,  96  ;  dia- 
gram, 97  ;  means,  medians,  and 
modes,  122. 

Bateman,  H.,  refs.,  law  of  small 
chances,  273. 

Baten,  W.  D.,  refs.,  moments 
(correlation),  391. 

Bateson,  W.,  data  cited  from,  37, 
380-381. 

Beaven, E. S., refs., yield trials, 402.

Becker, R., refs., Anwendungen der math. Statistik auf Probleme der
Massenfabrikation, 404.

Beetles (Chrysomelidæ), sizes of genera, 363-364.

Beeton, Miss M., data cited from, 78.

Behrens, W. U., refs., yield trials, 403.

Bennett, T. L., refs., cost of living, 390.

Berkson,  J.,  refs.,  Bayes'  Theorem, 
400. 


Bernoulli, J., refs., Ars Conjectandi, 360.

Berry, R. A., refs., variation in mangels, 402; errors in feeding
experiments, 401.

Bertillon, J., ref., Cours élémentaire de statistique, 6, 360.

Bertrand, J. L. F., refs., Calcul des probabilités, 360.

Betz, W., ref., Ueber Korrelation, 360.

Bias in sampling, 261-262, 279-281, 336-337, 343, 353.

— in scale-reading, 362-363.

Bielfeld, Baron, J. F. von, use of word "statistics," 1.

Binomial series, 291-300; genesis of, in sampling of attributes,
291-293; calculated series for different values of p and n, 294,
295; experimental illustrations of, 258, 259, (qu. 1 and qu. 2) 274,
371; graphic method of forming a representation of series, 295-297;
mechanical method of forming a representation of series, 297-299;
refs., 313; direct determination of mean and standard-deviation,
299-300; deduction of normal curve from, 301-302; refs., 314.

Bispham, J. W., refs., errors of sampling in partial correlations, 397.

Blakeman, J., refs., tests for linearity of regression, 209, 354;
probable error of contingency coefficient, 354.

Boole, G., refs., Laws of Thought, 23.

Booth, Charles, on pauperism, 193, 195.

Borel, E., refs., Théorie des probabilités, 360.

Bortkewitsch (Bortkiewicz), L. von, law of small chances, 265-266, 370;
time-distributions, 389; refs., law of small chances, 273, 394;
sampling, 400.

Bowley, A. L., refs., effect of errors on an average, 356; on sampling,
354; Measurement of Groups and Series, 354; Elements of Statistics,
360, 404; Elementary Manual of Statistics, 360, 404; cost of living,
390; index numbers, 391; Prices and Wages, 1914-20, 391.
Bowley, A. L., and R. L. Connor, goodness of fit, ref., 396.

Bravais, A., refs., correlation, 188, 332.

British  Association,  data  cited  from, 
stature,  88 ;  weight,  95,  see 
Stature ;  Weight ;  Reports  on 
index-numbers  ;  refs.,  130-131  ; 
Address  by  A.  L.  Bowley  on  sam- 
pling, 354  ;  mathematical  tables, 
401. 

Brown,  J.  W.,  refs.,  index-correla- 
tions, 226,  252. 

Brown,  W.,  refs.,  effect  of  experi- 
mental errors  on  the  correlation- 
coefficient,  226  ;  The  Essentials  of 
Mental  Measurement,  360,  404. 

Brownlee,  J.,  refs.,  frequency  curves 
(epidemiology  and  random  migra- 
tion), 396. 

Bruns,  H.,  refs.,  Wahrscheinlich- 
keitsrechnung  und  Kollektivmass- 
lehre,  360. 

Brunt, D., refs., The Combination of
Observations,  404. 

Burnside,  W.,  refs.,  Theory  of  Proba- 
bility, 406. 

Camp,  B.  H.,  refs.,  correlation,  393  ; 
integrals  for  point  binomial  and 
hypergeometric  series,  395 ;  sam- 
pling, 398. 

Cave,  Beatrice  M.,  correlation  differ- 
ence method,  198  ;  refs.,  208. 

Cave-Browne-Cave,  F.  E.,  correla- 
tion difference  method,  198  ;  refs., 
208. 

Census  (England  and  Wales),  tabu- 
lation of  infirmities  in,  14-15 ; 
data  as  to  infirmities  cited  from, 
32-33  ;  classification  of  occupa- 
tions, as  example  of  a  hetero- 
geneous classification,  72  ;  classi- 
fication of  ages,  80,  and  refs.,  105  ; 
data  as  to  ages  of  husbands  and 
wives  cited  from,  159. 

Chaddock, R. E., refs., Principles
and  Methods  of  Statistics,  406. 

Chance,  in  sense  of  complex  causa- 
tion, 30  ;  of  success  or  failure  of 
an  event,  256. 

Chances,  law  of  small,  265-266, 
366-370  ;  refs.,  273,  394. 

Charlier, C. V. L., refs., theory of frequency curves, resolution of a
compound normal curve, 314, 315, 395.

Chebysheff, P. L., refs., fitting polynomials (see Isserlis, L.), 393;
means, 397; inequality, 398 (under Camp), 400 (under Smith, C. D.).

Childbirth, deaths in, application of theory of sampling, 282-284.

Chokhate, J. See Shohat, J.

Cholera  and  inoculation,  illustra- 
tions, 31-32,  34-35,  382-384. 

Christidis,  B.  G.,  refs.,  yield  trials, 
403. 

Chrysomelidæ, distribution of size of genus, 363-364.

Church, A. E. R., refs., probable errors, 398, 399.

Clapham, A. R., refs., yield trials, 403.

Class,  in  theory  of  attributes,  8 ; 
class  symbol,  9  ;  class-frequency, 
10  ;  positive  and  negative  classes, 
10  ;  ultimate  classes,  12  ;  order  of 
a  class,  10. 

Classification, generally, 8; by dichotomy,
def., 9; manifold, 60-
74,  76  ;  homogeneous  and  hetero- 
geneous, 71-72  ;  of  a  variable  for 
frequency-distribution  or  corre- 
lation table,  76,  80-81,  157,  164. 

Class-interval,  def.,  76 ;  choice  of 
magnitude  and  position,  79-80, 
362-363  ;  desirability  of  equality 
of intervals, 76, 82-83; influence
of  magnitude  on  mean,  113-114, 
115,  116  ;  on  standard  deviation, 
140,  212. 

Cloudiness at Breslau, frequency-distribution, 103; diagram, 104.

Coefficient,  of  association,  37-39  ;  of 
contingency,  64-67  ;  of  variation, 
149,  standard  error,  351  ;  refs., 
distribution  in  sampling,  400  ;  of 
correlation,  see  Correlation. 

Collins,  S.  H.,  refs.,  agricultural  ex- 
periments, 401. 

Colours,  naming  a  pair,  example  of 
contingency,  379-380. 

Connor,  R.  L.    See  Bowley. 

Consistence,  of  class-frequencies  for 
attributes,  generally,  17-24  ;  def., 
18-19  ;  conditions,  for  one  or  two 
attributes,  20 ;  for  three  attri- 
butes, 21-22  ;  refs.,  23. 
Consistence of correlation-coefficients, 250-251.

Contingency  tables,  def.,  60  ;  treat- 
ment of,  by  elementary  methods, 
61-63  ;  isotropy,  68-71,  328-331, 
testing  of  divergence  from  inde- 
pendence, 378-380. 

—  coefficient  of,  64-67  ;  application 
to correlation tables, 167, (qu. 3)
189 ;  standard  error  of  (refs.), 
355,  397,  399  ;  partial  or  multiple 
contingency  (refs.),  390. 

Contrary  classes  and  frequencies  (for 
attributes),  10  ;  case  of  equality 
of  contrary  frequencies  (qu.  6,  7, 
8),  16  ;  (qu.  8),  24  ;  (qu.  7,  8,  9), 
59. 

Correction  of  death-rates,  etc.,  for 
age  and  sex-distribution,  223- 
225  ;  refs.,  226,  392. 

—  of  standard-deviation  for  group- 
ing of  observations,  211-212 ;  refs. 
(including  correction  of  moments 
generally),  225. 

Correction  of  correlation-coefficient 
for  errors  of  observation,  213- 
214  ;  refs.,  225-226,  392. 

Correlation,  generally,  157-253;  con- 
struction of  tables,  164  ;  represen- 
tation of  frequency-distribution 
by  surface,  165-167  ;  treatment  of 
table  by  coefficient  of  contingency, 
167  ;  correlation-coefficient,  170- 
174,  def.  174,  direct  deduction, 
231-233 ;  regressions,  175-177, 
direct  deduction,  365-366,  def. 
175 ;  standard-deviations  of 
arrays,  177,  204,  205  ;  calculation 
of  coefficient  for  ungrouped  data, 
177-181,  for  a  grouped  table,  181- 
188  ;  between  movements  of  two 
variables,  difference  method,  197- 
199,  fluctuation  method,  199-201  ; 
refs.,  208-209,  360,  392,  393,  401  ; 
elementary  methods  for  cases  of 
non-linear  regression,  201-202  ; 
rough  methods  for  estimating  co- 
efficient, 202-204 ;  correlation- 
ratio,  204-207,  252 ;  effect  of 
errors  of  observation  on  the  co- 
efficient, 213-214 ;  correlation 
between  indices,  215-216 ;  co- 
efficient for  a  fourfold  table, 
direct,  216-217,  on  assumption  of 
normal correlation (Pearson's coefficient) (refs.), 40, 333, 390; for
all possible pairs of N values, 217-218; correlation due to
heterogeneity of material, 218-219; effect of adding uncorrelated pairs
to a given table, 219-220; application to theory of weighted mean,
221-223; correlation in theory of sampling, 271, 286-289, 342,
349-350; standard error of coefficient, 352. Refs., 188, 208-209,
225-226, 390, 391, 392, 393, 397, 398, 399, 400, 401, 406. For
Illustrations, Normal, Partial, Ratio, see below.

Correlation, Illustrations and Examples, correlation between:

Two diameters of a shell (Pecten), 158; constants (qu. 3), 189.

Ages of husband and wife, 159; diagram, 173; constants (qu. 3), 189.

Statures of father and son, 160; diagrams, facing 166, 174;
constants (qu. 3), 189; correlation-ratios, 206-207; testing normality
of table, 322-328; diagram of diagonal distribution, 325; of
contour-lines fitted with ellipses of normal surface, 327.

Fertility of mother and daughter, 161, 195-196; diagram, 175;
constants (qu. 3), 189.

Discount rates and percentage of reserves on deposits, 162;
diagram, facing 166.

Sex-ratio and numbers of births in different districts, 163, 175;
diagram, 176; constants (qu. 3), 189; correlation-ratios, 207;
standard-deviations of arrays and comparison with theory of
sampling, (qu. 7) 275 and (qu. 1) 289.

Earnings of agricultural labourers, pauperism and out-relief,
177-181; constants, (qu. 2) 189, 239; correlation-ratios, 207;
treatment by partial correlation, 239-241; geometrical
representation, 245-247.

Old-age pauperism and out-relief, 182-185.

Changes in pauperism, out-relief, proportion of old and
population, 192-195; partial correlation, 241-245.

Lengths  of  mother-  and 
daughter-frond  in  Lemna  minor, 
185-187. 

Weather  and  crops,  196-197. 

Movements  of  infantile  and 
general  mortality,  197-199. 

Movements of marriage-rate
and foreign trade, 199-201.

Correlation, normal, 317-334; deduction of expression for two
variables, 318-319; constancy of
standard-deviation of arrays and
linearity  of  regression,  319-320  ; 
contour  lines,  320-321  ;  normality 
of  linear  functions  of  two  nor- 
mally distributed  variables,  321  ; 
principal  axes,  321-322  ;  testing 
for  normality  of  correlation  table 
for  stature,  322-328  ;  isotropy  of 
normal  correlation  table,  328-331 ; 
outline  of  theory  for  any  number 
of  variables,  331-332  ;  coefficient 
for  a  normal  distribution  grouped 
to  fourfold  form  round  medians 
(Sheppard's  theorem),  (qu.  4)  334 ; 
applications  to  theory  of  quali- 
tative observations  (refs.),  333. 
Refs.,  332-333,  390,  391,  397. 
—  partial,  229-253  ;  the  problem, 
partial  regressions  and  correla- 
tions, 229-231  ;  direct  deduction, 
365-366  ;  notation  and  defini- 
tions, 233-234 ;  normal  equa- 
tions, fundamental  theorems  on 
product-sums,  234-235 ;  signi- 
ficance of  generalised  regressions 
and  correlations,  236  ;  reduction 
of  standard-deviation,  236-237;  of 
regression,  237-238  ;  of  correla- 
tion, 238  ;  arithmetical  treatment, 
238-245 ;  representation  by  a 
model,  245-247  ;  coefficient  of 
n-fold correlation, 247-249; expression of correlations and
regressions in terms of those of higher
order,  249-250  ;  consistence  of  co- 
efficients, 250-251  ;  fallacies,  251- 
252;  limitations  in  interpretation  of 
the  partial  correlation-coefficient, 
partial  association  and  partial  cor- 
relation, 252  ;  partial  correlation 
in  case  of  normal  distribution  of 
frequency,  331-332  ;   refs.,  252- 


253,  332-333,  393,  394,  397, 
398. 

Correlation  ratio,  204-207  ;  standard 
error,  352  ;  refs.,  209,  398,  400  ; 
partial,  252,  and  refs.,  252,  393, 
398,  400. 

Cosin,  values  of  estates  in  1715,  100. 

Cost  of  living,  refs.,  390-391. 

Cotsworth,  M.  B.,  refs.,  multiplica- 
tion table,  358. 

Cournot,  A.  A.,  refs.,  theory  of  prob- 
ability, 361. 

Craig, C. C., refs., sampling, 399,
400. 

Cramer,  H.,  refs.,  series  used  in 
mathematical  statistics,  395 ; 
theory  of  error,  395. 

Crawford,  G.  E.,  refs.,  proof  that 
arithmetic  mean  exceeds  geo- 
metric, 130. 

Crelle,  A.  L.,  refs.,  multiplication 
table,  358. 

Crops  and  weather,  correlation,  196- 
197. 

Crum, L. W., refs., Economic Statistics, 405.

Cunningham,  E.,  ref.,  omega-func- 
tions, 314. 

Czuber,  E.,  refs.,  Wahrscheinlich- 
keitsrechnung,  361  ;  Die  statis- 
tische  Forschungsmethode,  404. 

Darbishire, A. D., data cited from, 128, 265; refs., illustrations of
correlation, 188, 273.

Darmois, G., refs., time series, 393; Statistique Mathématique, 406.

Darwin, Charles, data cited from, 269-270.

Datura, association between colour and prickliness of fruit, 37, 38,
(qu. 10) 275, 380-381.

Davenport, C. B., data as to Pecten cited from, 158. Refs., statistical
tables, 358.

Day, E. E., refs., Statistical Analysis, 405.

Deaf-mutism,  association  with  im- 
becility, 33-34,  38 ;  frequency 
amongst  offspring  of  deaf-mutes, 
table,  104. 

Deaths,  death-rates,  association  with 
occupation  (partial  correction  for 
age-distribution), 52-53 ; in England
and Wales, 1881-1890, table,
77 ; from diphtheria, table, 98,
diagram,  97  ;  infantile  and  gene- 
ral, correlation  of  movements, 
197-199  ;  standardisation  of,  for 
age  and  sex-distribution,  52-53, 
223-225,  refs.,  226,  392  ;  applica- 
tions of  theory  of  sampling — 
deaths  from  accident,  265-266, 
deaths  in  childbirth,  282-284, 
deaths  from  explosions  in  mines, 
287-288 ;  inapplicability  of  the 
theory  of  simple  sampling,  260- 
261,  282-284,  285-286,  287-288  ; 
criteria (refs.), 390.

Deciles, 150-152 ; standard error of,
337-341. 

Defects  :  in  school  children,  associa- 
tion of,  12,  45-46,  refs.,  15  ;  cen- 
sus tabulation  of,  14-15. 

De Morgan, A., refs., Formal Logic,
23  ;  Theory  of  Probabilities,  361. 

Detlefsen,  J.  A.,  refs.,  fluctuations 
of  sampling  in  Mendelian  popula- 
tion, 394. 

Deviation,  mean,  134 ;  generally, 
144-147  ;  def.,  144  ;  is  least  round 
the  median,  144-145  ;  refs.,  154  ; 
calculation  of,  145-146,  (qu.  7) 
155-156  ;  comparison  of  advan- 
tages with  standard-deviation, 
146 ;  of  magnitude  with  standard- 
deviation,  146-147  ;  of  normal 
curve,  304. 

Deviation,  quartile.    See  Quartiles. 

—  root-mean-square.  See  Devia- 
tion, standard. 

—  standard,  134-144  ;  def.,  134  ; 
relation  to  root-mean-square  de- 
viation from  any  origin,  134-135  ; 
is  the  least  possible  root-mean- 
square  deviation,  135  ;  little 
affected  by  small  errors  in  the 
mean,  135  ;  calculation  for  un- 
grouped  data,  135-137,  for  a 
grouped  distribution,  138-141  ; 
influence  of  grouping,  140,  211- 
212  ;  range  of  six  times  the  s.d. 
contains  the  bulk  of  the  observa- 
tions, 140-142,  309  ;  of  a  series 
compounded  of  others,  142-143  ; 
of  N  consecutive  natural  numbers, 
143 ; of rectangle, 143 ; of arrays
in theory of correlation, 177, 204,
205, 319-320 ; of generalised deviations
(arrays), 234, 236-237 ;
other names for, 144 ; of a sum
or  difference,  210-211  ;  effect  of 
errors  of  observation  on,  211  ;  of 
an  index,  214-215  ;  of  binomial 
series,  299-300  ;  of  law  of  small 
chances,  366-370.  For  standard- 
deviations  of  sampling,  see  Error, 
standard. 

De  Vries,  H.,  data  cited  from,  102. 

Dice,  records  of  throwing,  258-259, 
(qu.  1,  2,  3)  274,  371  ;  testing  for 
significance  of  divergence  from 
theory,  267,  373-376  ;  refs.,  273. 

Dickson,  J.  D.  Hamilton,  normal 
correlation  surface,  328.  Refs., 
normal  correlation,  333. 

Difference  method  in  correlation, 
197-199  ;  refs.,  226,  252,  392-393. 

Diphtheria,  ages  at  death  from, 
table,  98  ;  diagram,  97. 

Discounts  and  reserves  in  American 
banks,  table,  163 ;  diagram,  facing 
166. 

Dispersion, measures of, 107, 133-156 ;
unsuitability of range as
a measure, 123 ; relative, 149 ;
refs., 154. See Deviation, mean ;
Deviation, standard ; Quartiles.

Distribution of Frequency. See
Frequency-distribution.

Dodd, E. L., refs., frequency curves,
395 ; sampling, 397, 398.

Doodson, A. T., refs., mode, median,
and mean, 390.

Duckweed, correlation between
mother- and daughter-frond, 185-187.

Duffell,  J.  H.,  ref.,  tables  of  gamma- 
function,  358. 

Duncker,  G.,  relation  between  geo- 
metric and  arithmetic  mean  (qu. 
9),  156. 

Earnings  of  agricultural  labourers  : 
calculation  of  standard-deviation, 
135-137  ;  mean  deviation,  145  ; 
quartiles,  147  ;  correlation  with 
pauperism  and  out-relief,  177-181, 
constants,  (qu.  2)  189,  239  ;  dia- 
gram, 180  ;  by  partial  correlation, 
239-247  ;  diagram  of  model,  246. 

Eden, T., refs., yield trials, 402 ;
with  tea,  403. 

Edgeworth, F. Y., dice-throwings
(Weldon), 258 ; probable error of
median, etc., 344. Refs., index-numbers,
130-131, 391 ; correlation,
188, 252, 333 ; law of error
(normal  law)  and  frequency- 
curves  generally,  273,  314,  395  ; 
theory  of  sampling,  probable 
errors,  etc.,  273,  354  ;  dissection 
of  normal  curve,  315. 

Elderton,  E.  M.,  refs.,  variate  differ- 
ence method,  392  ;  normal  curve 
tables,  395  ;  sampling,  399. 

Elderton,  W.  Palin,  refs.,  calculation 
of  moments,  154  ;  table  of  powers, 
358  ;  tables  for  testing  fit,  354, 
358  ;  Frequency  Curves  and  Cor- 
relation, 154,  361,  404. 

Engineering,  applications  of  statis- 
tical method,  refs.,  404. 

Engledow,  F.  L.,  refs.,  yield  trials, 
402. 

Epidemiology,  applications  of  statis- 
tical method  to,  refs.,  396. 

Error,  law  of  ;  errors,  curve  of.  See 
Normal  curve. 

—  mean,  144. 

—  mean  square,  144. 

—  of  mean  square,  144. 

—  probable,  in  sense  of  semi-inter- 
quartile range,  147  ;  in  theory  of 
sampling,  310-311.  For  general 
references,  see  Error,  standard. 

—  standard,  def.,  267  ;  of  number 
or  proportion  of  successes  in  n 
events,  256-257  ;  when  numbers 
in  samples  vary,  264-265  ;  when 
chance  of  success  or  failure  is 
small,  265-266 ;  of  percentiles 
(median,  quartiles,  etc.),  337-341  ; 
of  arithmetic  mean,  344-350  ;  of 
standard-deviation  and  coefficient 
of  variation,  351  ;  of  coefficients 
of  correlation  and  regression,  352  ; 
of  correlation-ratio  and  test  for 
linearity  of  regression,  352  ;  refs., 
273,  289,  354-355,  397-401.  See 
also  Sampling,  theory  of. 

—  theory  of.  See  Sampling,  theory 
of. 

Estates,  annual  value  of.  See  Value. 

Everitt,  P.  F.,  refs.,  tables  for  calcu- 
lating Pearson's  coefficient  for  a 
fourfold  table,  358. 

Exclusive  and  inclusive  notations  for 
statistics  of  attributes,  14-15. 

Explosions in coal-mines, deaths
from, as illustrating theory of
sampling, 288.

Eye-colour,  association  between 
father  and  son,  34-35,  38,  70-71  ; 
association  between  grandparent, 
parent,  and  child,  46-48,  53-54  ; 
contingency  with  hair-colour,  61, 
63,  66-68  ;  non-isotropy  of  con- 
tingency table  for  father  and  son, 
70-71. 

Ezekiel,  Mordecai,  refs.,  correlation, 
393, 394 ; sampling, 400 ; Methods
of  Correlation  Analysis,  406. 

Falkner, R. P., refs., translation of
Meitzen's  Theorie  der  Statistik,  6. 

Fallacies,  in  interpreting  associations 
— theorem  on,  48-49,  illustrations, 
49-51  ;  owing  to  changes  of  classi- 
fication, actual  or  virtual,  72  ;  in 
interpreting  correlations — "  spuri- 
ous "  correlation  between  indices, 
215-216 ;  correlation  due  to 
heterogeneity  of  material,  218- 
219  ;  difference  of  sign  of  total 
and  partial  correlations,  251-252. 

Fay, E. A., data cited from Marriages
of the Deaf in America, 104.

Fechner,  G.  T.,  refs.,  frequency-dis- 
tributions, averages,  measures  of 
dispersion,  etc.,  129,  154 ;  Kol- 
lektivmasslehre,  129,  314,  361. 

Fecundity  of  brood-mares,  table,  96 ; 
diagram,  94  ;  mean,  median,  and 
mode,  (qu.  3)  131  ;  inheritance 
(ref.),  208,  226. 

Feeding  trials,  errors  in,  refs.,  401- 
404. 

Fertility  of  mother  and  daughter, 
correlation,  161,  195-196 ;  dia- 
gram, 175  ;  constants,  (qu.  3)  189; 
ref.,  208,  226. 

Field  trials,  errors  in,  ref.,  401-404. 

Filon,  L.  N.  G.,  ref.,  probable  errors, 
354. 

Fisher, A., refs., Mathematical Theory
of Probabilities, 404.

Fisher, Irving, refs., index-numbers,
390, 391.

Fisher, R. A., use of term
"variance," 144 ; testing goodness
of fit, 378, 387 ; refs., goodness
of fit, 396, 397 ; of regression
lines, 391 ; errors of sampling in
correlation-coefficient, 354, 397,
399 ; probable errors, 397, 398,
399,  400 ;  extremes  of  sample, 
399  ;  yield  trials,  402,  403  ; 
Statistical  Methods  for  Research 
Workers,  405. 

Fit  of  a  theoretical  to  an  actual 
frequency-distribution, testing,
generally,  370-389  ;  comparison 
frequencies  given  a  priori,  370- 
378  ;  cautions,  373-376  ;  experi- 
mental illustration,  377-378 ;  com- 
parison frequencies  based  on  the 
observations,  378-389 ;  contin- 
gency tables,  378-380  ;  associa- 
tion tables,  380-383  ;  aggregate 
of  tables,  383-384  ;  experimental 
illustrations,  384-387  ;  P-table 
for  use  with  association  tables, 
388-389  ;  refs.,  315,  391,  396-397 ; 
tables  for,  358. 

Fluctuation,  measure  of  dispersion, 
144. 

Flux, A. W., refs., measurement of
price-changes, 390.

Forcher, H., refs., Die statistische
Methode als selbständige Wissenschaft,
404.

Fountain, H., ref., index-numbers of
prices, 131.

Frequency of a class, 10, 76.

Frequency-curve, def., 87 ; ideal
forms of, 87-105 ; normal curve
(q.v.), 301-313 ; refs., 105, 314,
394-396.

Frequency-distributions,  76 ;  forma- 
tion of,  79-83  ;  graphic  represen- 
tation of,  83-87  ;  ideal  forms — 
symmetrical,  87-90,  moderately 
asymmetrical,  90-98,  extremely 
asymmetrical (J-shaped), 98-102,
363-364, U-shaped, 102-105 ; binomial
series, 291-300 ; hypergeometrical
series (ref.), 289 ; normal
curve, 301-313 ; theoretical
forms,  refs.,  289,  314,  394-396; 
testing  goodness  of  fit,  373-376. 
See  Binomial  series ;  Normal 
curve  ;  Correlation,  normal. 

— illustrations : of death-rates in
England  and  Wales,  77  ;  of  ages 
at  death  of  certain  women,  78  ;  of 
stigmatic  rays  on  poppies,  78  ;  of 
annual  values  of  dwelling-houses 
in Great Britain, 83 ; of head-breadths
of Cambridge students,
84 ; of statures of males in the
U.K.,  88,  90  ;  of  pauperism  in 
different  districts  of  England  and 
Wales,  93  ;  of  weights  of  males  in 
the  U.K.,  95  ;  of  fecundity  of 
brood-mares,  96 ;  of  barometer 
heights  at  Southampton,  96  ;  of 
ages  at  death  from  diphtheria,  98  ; 
of  annual  values  of  estates,  100  ; 
of petals in Ranunculus bulbosus,
102  ;  of  degrees  of  cloudiness  at 
Breslau,  103  ;  of  percentages  of 
deaf-mutes in offspring of deaf-mutes,
104 ; sizes of genera
(Chrysomelidae), 364. See also
Correlation, illustrations and
examples.

Frequency-polygon, construction of,
84.

Frequency-surface,  forms  and  ex- 
amples of,  164-167  ;  diagrams, 
166,  facing  166  ;  normal,  diagram, 
166.    See  Correlation,  normal. 

Frisch,  Ragnar,  refs.,  correlation, 
391 ; time series, 393.

Fry, T. C., refs., Probability and its
Engineering  Uses,  404. 

Gabaglio, A., ref., Teoria generale
della  statistica,  6. 

Galloway,  T.,  ref.,  Treatise  on  Prob- 
ability, 361. 

Galton,  Sir  Francis,  Hereditary 
Genius,  3  ;  frequency-distribution 
of  consumptivity,  104  ;  grades 
and percentiles, 150, 152 ; regression,
176 ; Galton's function (correlation-coefficient),
204 ; binomial
machine, 299 ; normal
correlation, 328 ; data cited from,
34,  46,  70.  Refs.,  geometric  mean, 
130 ;  percentiles,  154  ;  correla- 
tion, 188,  332  ;  correlation  be- 
tween indices,  226 ;  binomial 
machine,  313 ;  Natural  Inherit- 
ance, 154,  313,  332. 

Gamma  functions,  tables,  refs., 
401. 

Gauss,  C.  F.,  use  of  term  "  mean 
error,"  144.  Refs.,  normal  curve, 
314  ;  method  of  least  squares,  361. 

Geary, R. C., refs., frequency distributions,
395.

Geiger,  H.,  refs.,  law  of  small 
chances, 269.

Geometric mean. See Mean, geometric.

Geometric (logarithmic) mode, 128.

Gibbs, J. Willard, Principles of
Statistical Mechanics, 4.

Gibson, Winifred, refs., Tables for
computing probable errors, 354,
358.

Gini, C., refs., index-numbers, 391.

Goodness of fit, generally, 370-389 ;
refs., 391, 396-397. See also Fit.

Grades, 152, 153.

Graphic  method,  of  representing 
frequency-distributions,  83-87  ;  of 
interpolation  for  median  or  per- 
centiles, 118,  151-152  ;  of  repre- 
senting correlation  between  two 
variables, 180-181 ; of estimating
correlation-coefficient, 203-204 ;
of  forming  one  binomial  polygon 
from  another,  295-297. 

Graunt,  John,  ref.,  Observations  on 
the  Bills  of  Mortality,  6. 

Gray,  John,  data  cited  from,  270. 

Greatest  and  least  values  of  sample, 
refs.,  398  (Dodd),  399  (Fisher  and 
Tippett). 

Greenwood,  M.,  refs.,  index  correla- 
tions, 226,  252  ;  errors  of  sam- 
pling (small  samples),  289,  398  ; 
inoculation  statistics  and  associa- 
tion, 40  ;  application  of  law  of 
small  chances,  394 ;  multiple 
happenings,  396. 

Grindley,  H.  S.,  refs.,  errors  of  feed- 
ing trials,  402. 

Grouping  of  observations  to  form 
frequency-distribution,  choice  of 
class-interval,  79-80 ;  influence 
on  mean,  113-114,  115,  116;  in- 
fluence on  standard-deviation, 
140,  212. 

Grubb,  N.  H.,  refs.,  error  in  currant 
trials,  402. 

Gumbel,  E.  J.,  refs.,  spurious  corre- 
lation, 392. 


Hair-colour  :  and  eye-colour,  ex- 
ample of  contingency,  61-63,  66- 
67  ;  non-isotropy,  68,  69  ;  theory 
of  sampling  applied  to  certain 
data,  270-271,  272. 

Hall,  A.  D.,  refs.,  errors  of  agri- 
cultural experiment,  401,  402. 


Hall,  Philip,  refs.,  partial  correlation, 
394  ;  probable  errors,  398. 

Harmonic  analysis,  sampling,  refs., 
399 (see Fisher, R. A.).

Harmonic  mean.  See  Mean,  har- 
monic. 

Harper,  F.  H.,  refs.,  Practical 
Statistics,  406. 

Harris,  J.  A.,  refs.,  short  method  of 
calculating coefficient of correlation,
209 ; intra-class coefficients,
209  ;  correlation,  miscellaneous, 
392  ;  error  in  field  experiments, 
401. 

Hart, B., refs., effect of errors on
correlation, 392.

Hatton, R. G., refs., error in currant
trials, 402.

Hayes, H. K., refs., variety trials,
402, 403.

Head-breadths  of  Cambridge  stu- 
dents, table,  84  ;  diagram,  85. 

Helguero,  F.  de,  refs.,  dissecting 
compound  normal  curve,  315. 

Henry, A., refs., Calculus and Probability,
404.

Heron,  D.,  refs.,  association,  40  ;  re- 
lation between  fertility  and  social 
status,  208 ;  defective  physique 
and  intelligence,  application  of 
correction  for  age-distribution, 
etc.,  226  ;  abac  giving  probable 
errors of correlation-coefficient,
354, 358 ; probable error of a
partial correlation-coefficient, 354.

Histogram,  construction  of,  84. 

History,  refs.,  of  statistics  generally, 
5-6,  390  ;  of  correlation,  188,  391  ; 
of  normal  curve,  395. 

Hoblyn,  T.  L.,  refs.,  horticultural 
experiment,  403. 

Hollis, T., cited re Cosin's Names of
the  Roman  Catholics,  etc.,  100. 

Holzinger,  K.  S.,  refs.,  sampling,  399. 

Hooker,  R.  H.,  correlation  between 
weather  and  crops,  196  ;  between 
movements  of  two  variables,  200, 
201.  Refs.,  correlation  between 
movements  of  two  variables,  208  ; 
weather  and  crops,  208,  253  ; 
theory  of  partial  correlation,  252. 

Horticulture,  errors  in,  refs.,  401- 
404. 

Hotelling,  Harold,  refs.,  history, 
390 ; Analysis Situs, 393 ; time
series, 393 ; probable errors, 398,
400. 

Houses,  inhabited  and  uninhabited, 
in  rural  and  urban  districts,  61-62; 
annual  value  of,  table,  83;  median, 
(qu.  4)  131  ;  quartiles,  (qu.  3)  155. 

Hubback,  J.  A.,  refs.,  rice  trials, 
402. 

Hudson,  H.  P.,  refs.,  frequency- 
curves  (epidemiology),  396. 

Hull,  C.  H.,  ref.,  The  Economic 
Writings  of  Sir  William  Petty, 
together  with  the  observations  on 
the  Bills  of  Mortality  more  probably 
by  Captain  John  Graunt,  6. 

Husbands  and  wives,  correlation  be- 
tween ages,  table,  159  ;  diagram, 
173  ;  constants,  (qu.  3)  189. 

Hypergeometrical  Series,  ref.,  289. 

Illusory  associations,  48-51. 

Imbecility, associations with deaf-mutism,
32-33, 38.

Immer,  F.  R.,  refs.,  field  trials,  403. 

Inclusive  and  exclusive  notations  for 
statistics  of  attributes,  14-15. 

Independence,  criterion  of,  for  attri- 
butes, 25-28  ;  case  of  complete, 
for  attributes,  56-57  ;  form  of 
contingency  or  correlation  table 
in  case  of,  71  ;  goodness  of  fit  test 
for,  378-387. 

Index-numbers  of  prices,  def.,  126  ; 
use  of  geometric  mean  for,  126- 
127  ;  use  of  harmonic  mean,  129  ; 
refs.,  130-131,  390-391. 

Indices,  correlation  between,  215- 
216  ;  refs.,  226,  252,  392. 

Infirmities,  census  tabulation  of, 
14-15  ;  association  between  deaf- 
mutism  and  imbecility,  32-33,  38. 

Inoculation,  cholera,  examples,  31- 
32,  34-35,  382-384. 

Intermediate  observations,  in  a 
frequency-distribution,  classifica- 
tion of,  80-81,  362-363  ;  in  corre- 
lation table,  164. 

Irwin, J. O., refs., analysis of variance,
394 ; goodness of fit, 396 ;
probable  errors,  etc.,  398,  399, 
400  ;  recent  advances,  401. 

Isotropy,  def.,  68  ;  generally,  67-71 ; 
of  normal  correlation  table,  328- 
331 ;  refs.,  73. 

Isserlis, L., refs., partial correlation-ratio,
252, 393 ; conditions for
real significance of probable errors,
354 ; fitting polynomials (Chebysheff),
393 ; probable error of
mean, 397 ; small samples (see
Greenwood), 398.

Jacob,  S.  M.,  ref.,  crops  and  rainfall, 
208,  226. 

Jeffery, G. B., refs., sampling, 399.

Jevons, W. Stanley, use of geometric
mean, 127. Refs., system of
numerically definite reasoning
(theory of attributes), 15 ; index-numbers,
130 ; Pure Logic and
other Minor Works, 15 ; Investigations
in Currency and Finance,
130.

Johannsen, W., refs., Elemente der
exakten Erblichkeitslehre, 361.

John, V., refs., Geschichte der Statistik,
5.

Jones, D. C., refs., A First Course in
Statistics, 404.

Jordan, C., refs., time series, 393 ;
Statistique Mathematique, 406.

Jorgensen, M., refs., agricultural
experiment, 403.

J-shaped frequency-distributions,
98-102, 363-364.

Julin, A., refs., Principes de Statistique,
405.

Kapteyn, J. C., refs., Skew Frequency-curves
in Biology and Statistics,
130, 314.

Kelley,  T.  L.,  refs.,  correlation,  393, 
394  ;  Statistical  Method,  405. 

Keynes, J. M., refs., A Treatise on
Probability, 405.

Kick of a horse, deaths from, following
law of small chances, 265-266,
369-370.

Kindermann, M., refs., yield trials,
403.

King,  George,  refs.,  graduation  of 
age  statistics,  105. 

Knibbs, G. H., refs., price index-numbers,
390 ; frequency-curves,
396.

Knight, R. C., refs., error in currant
trials, 402.

Kohlweiler, E., refs., Statistik im
Dienste der Technik, 404.

Kohn, S., refs., Theory of Statistical
Method, 406.

Kondo, T., refs., normal curve
tables, 395 ; sampling, 399, 400.

Koren, J., refs., History of Statistics,
390.

Labour  Gazette,  Index  Number,  refs., 
391. 

Labourers,  earnings  of  agricultural. 
See  Earnings. 

Laplace,  Pierre  Simon,  Marquis  de, 
probable  error  of  median,  344. 
Refs.,  normal  curve,  314  ;  mean 
deviation  least  about  the  median, 
154  ;  Theorie  analytique  des  proba- 
bilites,  154,  354,  361  ;  Essai  philo- 
sophique,  361,  405. 

Larmor,  Sir  J.,  use  of  word  "  statis- 
tical," 4. 

Lee,  Alice,  data  cited  from,  96,  122, 
160,  161.  Refs.,  inheritance  of 
fertility  and  fecundity,  208,  226  ; 
tables  of  functions,  358,  359. 

Lemna minor, correlation between
lengths  of  mother-  and  daughter- 
frond,  185-187. 

Lexis,  W.,  use  of  term  "  precision," 
144. Refs., Theorie der Massenerscheinungen,
213 ; Abhandlungen
zur  Theorie  der  Bevolkerungs-  und 
Moralstatistik,  273,  361. 

Linearity  of  regression,  test  for, 
205-206, 352 ; refs., 391. See also
Correlation-ratio.

Lipps, G. F., refs., measures of
dependence  (association,  correla- 
tion, contingency,  etc.),  40  ; 
Fechner's  Kollektivmasslehre,  129, 
360. 

Little, W., data as to agricultural
labourers' earnings cited from, 137.

Lloyd, W. E., refs., error in soil
surveys, 402.

Lobelia, application of theory of
sampling to certain data, 269-270,
272.

Logarithmic increase of population,
125-126 ; logarithmic mode, 128.

Lord, L., refs., rice trials, 402.

Lyon, T. L., refs., errors of agricultural
experiment, 402.

Macalister, Sir Donald, ref., law
of  geometric  mean,  130,  314. 


Macaulay, F. G., refs., smoothing
time series, 393.

Macdonell, W. R., data cited from,
84, 90.

March, L., refs., correlation, 208 ;
index-numbers, 391 ; Les Principes
de la Methode Statistique, 408.

Marriage-rate  and  trade,  correlation 
of movements, 199-201.

Marshall, A., ref., Money Credit and
Commerce,  391. 

Maskell, E. J., refs., experimental
error in agriculture, 403 ; sugar
cane, 403.

Maxwell,  Clerk,  use  of  word  "  sta- 
tistical," 4. 

McKay, A. T., refs., sampling, 400.

McNemar,  Q.,  refs.,  partial  correla- 
tion, 394. 

Mean,  arithmetic,  generally,  108- 
116;  def.,  108-109;  nature  of, 
109  ;  calculation  of,  for  a  grouped 
distribution,  109-113  ;  influence 
of  grouping,  113-114,  115,  116; 
position  relatively  to  mode  and 
median,  121-122,  (refs.)  390  ;  dia- 
grams, 113,  114;  sum  of  devia- 
tions from,  is  zero,  114  ;  of  series 
compounded  of  others,  115  ;  of 
sum  or  difference,  115-116  ;  com- 
parison with  median,  119;  sum- 
mary comparison  with  median  and 
mode,  mean  is  the  best  for  all 
general  purposes,  122-123;  weight- 
ing of,  220-225;  of  binomial 
series,  299 ;  of  law  of  small 
chances,  369  ;  standard  error  of, 
334-350,  (refs.)  355,  397-401. 

— deviation. See Deviation, mean.

—  error,  144.  See  Error,  standard  ; 
Deviation,  standard. 

—  geometric,  108  ;  generally,  123- 
128  ;  def.,  123  ;  calculation,  124  ; 
less  than  arithmetic  mean,  123  ; 
difference  from  arithmetic  mean 
in terms of dispersion, (qu. 8) 156 ;
of  series  compounded  of  others, 
124  ;  of  series  of  ratios  or  pro- 
ducts, 124  ;  in  estimating  inter- 
censal  populations,  125-126;  con- 
venience for  index-numbers,  126- 
127  ;  use  on  ground  that  devia- 
tions vary  with  absolute  magni- 
tude, 127-128  ;  weighting  of,  225. 

— harmonic, 108 ; generally, 128-129 ;
def., 128 ; calculation, 128 ;
is  less  than  arithmetic  and  geo- 
metric means,  129  ;  difference 
from  arithmetic  mean  in  terms  of 
dispersion,  (qu.  9)  156 ;  use  in 
averaging prices in index-numbers,
129 ; in theory of sampling, when
numbers  in  samples  vary,  264- 
265. 

Mean  square  error,  144. 

—  weighted,  220-225  ;  def.,  220  ; 
difference  between  weighted  and 
unweighted  means,  221-223  ;  ap- 
plication of  weighting  to  correc- 
tion of  death-rates,  etc.,  for  age- 
and  sex-distribution,  223-225  ; 
refs.,  226,  392. 

Median,  108;  generally,  116-120; 
def.,  116;  indeterminate  in  cer- 
tain cases,  116-117  ;  unsuited  to 
discontinuous  observations  and 
small  series,  116-117  ;  calculation 
of,  117  ;  graphical  determination 
of,  118;  comparison  with  arith- 
metic mean,  119;  advantages  in 
special  cases,  119-120;  slight  in- 
fluence of  outlying  values  on,  120  ; 
position  relatively  to  mean  and 
mode,  121-122,  diagrams,  113, 
114,  (refs.)  387  ;  weighting  of, 
225  ;  standard  error  of,  337-341, 
(refs.)  354. 

Meidell,  H.  B.,  refs.,  sampling,  398. 

Meitzen, P. A., refs., Geschichte,
Theorie und Technik der Statistik,
6.

Mendelian  breeding  experiments  as 
illustrations,  37,  38,  128,  264-265, 
267-268;  refs.,  fluctuations  of 
sampling  in,  273,  394. 

Mercer,  W.  B.,  refs.,  errors  of  agri- 
cultural experiment,  402. 

Methods,  statistical,  purport  of,  3-5  ; 
def.,  5. 

Mice,  numbers  in  litters,  harmonic 
mean,  128-129 ;  proportions  of 
albinos  in  litters,  fluctuations 
compared  with  theory  of  sam- 
pling, 264-265. 

Migration,  random,  refs.,  396. 

Milk-testing,  errors  in,  refs.,  401. 

Milton,  John,  use  of  word  "  statist," 
1. 

Miner, J. R., correlation, ref., 393.

Mises, R. von, refs., Wahrscheinlichkeit,
Statistik und Wahrheit, 406 ;
Wahrscheinlichkeitsrechnung, 407.

Mitchell, H. H., refs., errors of feeding
trials, 402.

Mitscherlich, E. A., refs., yield trials,
403.

Mode, 108 ; generally, 120-123 ;
def., 120 ; approximate determination,
from mean and median,
121-122 ; diagrams showing position
relatively to mean and
median, 113, 114 ; logarithmic
or geometric mode, 128 ; weighting
of, 225 ; refs., 130, 390.

Modulus as measure of dispersion,
144 ; origin from normal curve,
304.

Mohl, Robert von, refs., Geschichte
und Literatur der Staatswissenschaften,
5.

Moir, H., refs., frequency-curves
(mortality), 396.

Molina, E. C., refs., Bayes' Theorem,
401.

Moller-Arnold, E., refs., field trials,
403.

Moment, first, def., 110 ; second and
general, def., 135 ; calculation of
moments, (ref.) 154 ; errors of
sampling, 354-356, 397-401.

Montessus de Ballore, R. de, refs.,
Probabilites et Statistiques, 407.

Moore, L. Bramley, data cited from,
96, 161. Refs., inheritance of fertility
and fecundity, 208, 226.

Morant, G., refs., law of small
chances, 394.

Mortality. See Death-rates.

Movements, correlation of, in two
variables, methods, 197-201 ; refs.,
208, 392-393.

Negative classes and attributes, 10.

Newbold,  Ethel  M.,  refs.,  frequency- 
distributions,  accidents,  396. 

Newsholme, A., refs., birth-rates,
correction for age-distribution,
etc., 226 ; Vital Statistics, 359, 408.

Neyman, J., refs., goodness of fit,
397 ; probable errors, 398, 399,
400 ; yield trials, 403.

Niceforo, A., refs., La Methode Statistique,
405.

Nixon, J. W., refs., experimental
test of normal law, 314.

Normal curve of errors ; deduction
from  binomial  series,  301-302  ; 
value  of  central  ordinate,  304  ; 
table  of  ordinates,  303  ;  mean 
deviation  and  modulus,  304  ;  com- 
parison with  binomial  series  for 
moderate  value  of  n,  304-305  ; 
outline  of  more  general  methods  of 
deduction,  305-307  ;  fitting  to  a 
given  distribution,  307-308  ;  the 
table  of  areas,  310,  and  its  use, 
309-310 ; quartile deviation and
probable  error,  310-311  ;  numeri- 
cal examples  of  use  of  tables,  311- 
313  ;  normality  in  fluctuations  of 
sampling  of  the  mean,  346-347. 
Refs.,  general,  314  ;  dissection  of 
compound  curve,  315 ;  tables, 
358-359,  395,  401  ;  history,  395. 
For  normal  correlation,  see  Corre- 
lation, normal. 

Norton,  J.  P.,  data  cited  from,  162. 
Ref.,  Statistical  Studies  in  the  New 
York Money Market, 208.

Nybelle, H. C., refs., Theorie der
Statistik,  406. 

O'Brien,  D.  G.,  refs.,  errors  in  feed- 
ing experiments,  401. 

Order,  of  a  class,  10  ;  of  generalised 
correlations,  regressions,  devia- 
tions, and  standard  deviations, 
233-234. 

Palgrave, Sir R. H. I., Dictionary
of Political Economy, 6.

Papadakis, J., refs., yield trials, 403.

Pareto, V., refs., Cours d'economie
politique, 105.

Partial association. See Association,
partial.

—  correlation.  See  Correlation, 
partial. 

Patton, A. C., refs., Economic Sta-
tistics, 405. 

Pauperism,  in  England  and  Wales, 
table, 93 ; diagrams, 92, 113 ; calculation
of mean, 111 ; of median,
117,  118;  means,  medians,  and 
modes  for  other  years,  122  ;  stan- 
dard-deviation, 138-140 ;  mean 
deviation,  145-146;  quartiles, 
148;  percentiles,  151-152. 

— correlation with out-relief, 182-185 ;
with earnings and out-relief,
177-181, (qu. 2) 189, 239-241,
245-247  ;  with  out-relief,  propor- 
tion of  aged,  etc.,  192-195,  241- 
245. 

Pearl, Raymond, normal distribution
of number of seeds in Nelumbium,
306. Refs., probable errors,
355  ;  errors  in  variety  tests,  402  ; 
Introduction  to  Medical  Biometry, 
405. 

Pearson,  E.  S.,  refs.,  polychoric 
coefficients,  390;  goodness  of  fit, 
397 ; probable errors, 398, 399, 400.

Pearson,  Karl,  contingency,  63,  65  ; 
mode, 120 ; standard-deviation,
144  ;  coefficient  of  variation,  149  ; 
skewness,  149  ;  inheritance  of 
fertility,  195 ;  spurious  correla- 
tion between  indices,  215  ;  bi- 
nomial apparatus,  299  ;  deduction 
of  normal  curve,  303  ;  data  cited 
from,  70,  78,  90,  96,  122,  160,  161. 
Refs.,  correlation  of  characters  not 
quantitatively  measurable,  40, 
333  ;  contingency,  etc.,  72-73, 
333,  390,  397  ;  frequency-curves, 
105,  130,  154,  273,  289,  314,  315, 
354,  395  ;  binomial  distribution 
and  machine,  314  ;  hypergeomet- 
rical  series,  289  ;  dissection  of 
compound  normal  curve,  315  ; 
calculation  of  moments,  225  ; 
general  methods  of  curve-fitting, 
209  ;  testing  fit  of  theoretical  to 
actual  distribution,  315,  391,  396  ; 
correlation  and  correlation-ratio, 
188,  209,  225,  252,  333,  390,  391, 
392,  393,  397  ;  fitting  of  principal 
axes  and  planes,  209,  333  ;  corre- 
lation between  indices,  226 ; 
inheritance  of  fertility,  226  ; 
weighted  mean,  reproductive  se- 
lection, 226  ;  probable  errors,  355, 
394,  397,  398,  399,  401  ;  tables 
for  statisticians,  358,  401  ;  tables 
of  Gamma  functions,  401  ;  poly- 
choric coefficients  of  correlation, 
390  ;  variate  difference  method, 
392. 

Peas, applications of theory of sampling
to experiments in crossing,
267-268. 

Pecten, correlation between two
diameters of shell, 158 ; constants,
(qu. 3) 189.

Pepper, J., refs., sampling, 399.

Percentage,  standard  error  of,  256- 
257  ;  when  numbers  in  samples 
vary,  264-265.  See  also  Sam- 
pling of  attributes. 

Percentiles,  150-153  ;  def.,  150  ;  de- 
termination, 151-152  ;  advantages 
and  disadvantages,  152-153  ;  use 
for  unmeasured  characters,  152- 
153,  refs.,  333  ;  standard  errors 
of,  337-341  ;  correlation  between 
errors  of  sampling  in,  341-342 ; 
refs.,  154,  354-356. 

Perozzo,  L.,  ref.,  applications  of 
theory  of  probability  to  correla- 
tion of  ages  at  marriage,  314. 

Persons,  W.  M.,  refs.,  index- 
numbers,  391. 

Petals,  of  Ranunculus  bulbosus,  fre- 
quency of,  102  ;  unsuitability  of 
median  in  case  of  such  a  distribu- 
tion, 117. 

Peters,  J.,  refs.,  multiplication  table, 
358. 

Petty,  Sir  W.,  refs.,  Economic 
Writings,  6. 

Pickering,  S.  U.,  refs.,  errors  of  agri- 
cultural experiment,  401. 

Plaut,  H.,  refs.,  Anwendungen  der
math.  Statistik  auf  Probleme  der
Massenfabrikation,  404.

Poincaré,  H.,  refs.,  Calcul  des  prob-
abilités,  361.

Poisson,  S.  D.,  law  of  small  chances, 
368,  369  ;  refs.,  sex-ratio,  273  ; 
generally  and  applications,  394  ; 
Recherches  sur  la  probabilité  des
jugements,  273,  361.

Poppies,  stigmatic  rays  on,  fre- 
quency, 78 ;  unsuitability  of 
median  in  such  a  distribution, 
116. 

Population,  estimation  of,  between 
censuses,  125-126 ;  refs.,  130, 
253. 

Positive  classes  and  attributes,  def., 
10  ;  number  of  positive  classes, 
13  ;  sufficiency  of,  for  tabulation, 
13  ;  expression  of  other  fre- 
quencies, in  terms  of,  13-14. 

Poynting,  J.  H.,  correlation  of  fluc- 
tuations, 201  ;  refs.,  208. 

Precision,  144,  257,  304. 

Pretorius,  S.  J.,  refs.,  skew  frequency 
surfaces,  397. 


Prices,  index-numbers  of,  126  ;  use
of  geometric  mean,  126  ;  of  har- 
monic mean,  129  ;  refs.,  130-131, 
390-391. 

Principal  axes,  in  correlation,  321-
322  ;  ref.,  333.

Probability,  theory  of,  works  on,
refs.,  361,  404-407.

Quartile  deviation.    See  Quartiles.

Quartiles,  quartile  deviation  and 
semi-interquartile  range,  134  ; 
generally,  147-149  ;  defs.,  147  ; 
determination,  147-148  ;  ratio  of 
q.d.  to  standard-deviation,  148, 
310 ;  advantages  of  q.d.  as  a 
measure  of  dispersion,  148-149  ;
difference  between  deviations  of 
quartiles  from  median  as  measure 
of  skewness,  149-150 ;  ratio  of 
q.d.  to  median  as  measure  of  re- 
lative dispersion,  149  ;  q.d.  of 
normal  curve,  310 ;  standard 
errors,  337-341,  341-343;  refs., 
354-356. 

Quetelet,  L.  A.  J.,  refs.,  Lettres  sur  la
théorie  des  probabilités,  272,  361.

Random  sampling,  in  sense  of  simple 
sampling,  289. 

Range,  unsuitability  of,  as  a  measure 
of  dispersion,  133. 

Ranks,  143,  153  ;  methods  of  corre- 
lation based  on  (refs.),  333. 

Ranunculus,  frequency  of  petals,
102  ;  unsuitability  of  median  for 
such  distributions,  117. 

Registrar-General :  correction  or 
standardisation  of  death-rates, 
224,  refs.,  226,  392;  estimates 
of  population,  refs.,  130 ;  data 
cited  from  Reports,  32-33,  52-53,
77,  98,  163,  197-199,  199-201, 
222,  263,  283,  284,  285-286. 

Regressions,  generally,  175-177  ;
def.,  175  ;  total  and  partial,  233  ; 
standard  errors  of,  352 ;  non- 
linear, 201-202,  205-206,  352; 
direct  deduction,  365-366  ;  refs., 
208-209,  391,  392,  393,  394. 

Relative  dispersion,  149. 

Reserves  and  discounts  in  American 
banks,  correlation,  162  ;  diagram, 
facing  166.

Rhind,  A.,  ref.,  tables  for  computing
probable  errors,  355,  359. 

Rhodes,  E.  C.,  refs.,  fitting  poly-
nomials, 393  ;  sampling,  394,
396  ;  law  of  error,  395  ;  sampling, 
398,  399. 

Rider,  P.  R.,  refs.,  sampling,  399, 400. 

Rietz,  H.  L.,  refs.,  frequency  dis- 
tributions, 395  ;  Handbook  of 
Mathematical  Statistics,  405  ; 
Mathematical  Statistics,  405. 

Ritchie,  F.  D.,  refs.,  agronomic  ex- 
periment, 403. 

Ritchie-Scott,  A.,  refs.,  correlation
of  polychoric  table,  390.

Robinson,  G.,  refs.,  Calculus  of
Observations,  405. 

Robinson,  G.  W.,  refs.,  error  in  soil 
surveys,  402. 

Roemer,  T.,  refs.,  yield  trials,  403. 

Romanovsky,  V.,  refs.,  frequency- 
curves,  395  ;  sampling,  399. 

Ross,  Sir  R.,  refs.,  frequency-curves 
(epidemiology),  396. 

Runge,  I.,  refs.,  Anwendungen  der
math.  Statistik  auf  Probleme  der
Massenfabrikation,  404.

Russell,  E.  J.,  refs.,  errors  of  agri- 
cultural experiment,  401. 

Russell,  W.  T.,  refs.,  Medical  Sta- 
tistics, 407. 

Rutherford,  E.,  ref.,  law  of  small 
chances,  273. 

Salisbury,  F.  S.,  refs.,  correlation, 
393,  394  (see  Kelley).

Salvosa,  L.  R.,  refs.,  frequency-dis-
tributions  (tables),  395.

Sampling,  theory  of,  generally,  254- 
355  ;  the  problem,  254-256  ;  refs., 
273,  289,  313-315,  354-356,  392, 
393,  394-401. 

—  of  attributes  :  conditions  as- 
sumed in  simple  sampling,  255- 
256,  259-262  ;  random  in  sense  of 
simple  sampling,  289  ;  standard- 
deviation  of  number  or  proportion 
of  successes  in  n  events,  256-257, 
299-300  ;  examples  from  artificial 
chance,  258-259  ;  application  to 
sex-ratio,  262-264 ;  when  num- 
bers in  samples  vary,  264-265  ; 
when  chance  of  success  or  failure 
is  small,  265-266,  366-370  ;  stan-
dard error,  def.,  267  ;  comparing
a  sample  with  theory,  267-268  ;
comparing  one  sample  with  an-
other independent  therefrom,  268-
271  ;  comparing  one  sample  with
another  combined  with  it,  271-
272  ;  limitations  to  interpretation
of  standard  error  when  n  is  small, 
inverse  interpretation,  276-279  ; 
limits  as  a  measure  of  untrust- 
worthiness,  279-281  ;  effect  of 
removing  conditions  of  simple 
sampling,  281-289  ;  sampling 
from  limited  material,  287  ;  bi- 
nomial distribution,  291-300  ;  nor- 
mal curve,  300-313  ;  normal  cor- 
relation, 317-334  ;  law  of  small 
chances,  366-370 ;  refs.,  272-273, 
393,  395,  397-401.  See  also 
Binomial  series ;  Hypergeometri- 
cal  series ;  Normal  curve ;  Cor- 
relation, normal. 

Sampling  of  variables,  conditions 
assumed  in  simple  sampling,  335- 
337  ;  standard  errors  of  percen- 
tiles (median  and  quartiles),  337- 
341  ;  dependence  of  standard 
error  of  median  on  the  form  of  the 
distribution,  338-340;  of  differ- 
ence between  two  percentiles, 
341-343  ;  of  arithmetic  mean, 
344-350  ;  of  difference  between 
two  means,  345-346  ;  normality 
of  distribution  of  mean,  346-347  ; 
effect  of  removing  conditions  of 
simple  sampling  on  standard  error 
of  mean,  347-350  ;  standard  error 
of  standard-deviation  and  co-
efficient of  variation,  351  ;  of  co-
efficients of  correlation  and  re- 
gression, 352  ;  of  correlation-ratio 
and  test  for  linearity  of  regression, 
352  ;  refs.,  354-356,  397-401. 

Sanders,  H.  G.,  refs.,  uniformity
trials,  403. 

Saunders,  Miss  E.  R.,  data  cited 
from,  37. 

Scale-reading,  bias  in,  362-363. 

Scarborough,  J.  B.,  refs.,  Numerical
Mathematical  Analysis,  406. 

Scheibner,  W.,  difference  between 
arithmetic  and  geometric,  arith- 
metic and  harmonic  means,  (qu.  8 
and  qu.  9)  156. 

Scripture,  E.  W.,  use  of  word
"  statistics,"  3.

Secrist,  H.,  refs.,  Introduction  to
Statistical  Methods,  405.

Semi-interquartile  range.   See  Quar-
tiles.

Sex-ratio  of  births  :  correlation  with
total  births,  163,  175,  207  ;  dia- 
gram, 176  ;  constants,  (qu.  3)  189  ; 
application  of  the  theory  of  sam- 
pling to,  262-264,  (qu.  7)  275,  (qu. 
1,  2)  289,  refs.,  273;  standard 
error  of  ratio  of  male  to  female 
births,  (qu.  11)  275. 

Shakespeare,  W.,  use  of  word 
"  statist,"  1. 

Sheppard,  W.  F.,  correction  of  the 
standard-deviation  for  grouping, 
212,  307  ;  theorem  on  correlation 
of  a  normal  distribution  grouped 
round  medians,  (qu.  4)  334 ; 
normal  curve  tables,  337  ;  stan- 
dard errors  of  percentiles,  344. 
Refs.,  calculation  and  correction 
of  moments,  225  ;  normal  curve 
and  correlation,  theory  of  sam- 
pling, 314,  333,  355;  tables  of 
normal  function  and  its  integral, 
359  ;  goodness  of  fit,  397. 

Shewhart,  W.  A.,  refs.,  Engineering 
Applications  of  Statistical  Method, 
404. 

Shohat,  J.  (Chokhate,  J.),  refs.,
sampling,  399.

Significant  differences,  266.

Simpson,  T.  Wake,  refs.,  yield  trials,
403.

Sinclair,  Sir  John,  use  of  words
"  statistics,"  "  statistical,"  2.

Sipos,  A.,  refs.,  time  series,  393.

Skew  or  asymmetrical  frequency-
distributions,  90-102.  See  also
Frequency-distributions.

Skewness  of  frequency-distributions,
107  ;  measures  of,  149-150.

Slutsky,  E.,  refs.,  fit  of  regression
lines,  209,  391.

Small  chances,  law  of,  265-266,  366-
370  ;  refs.,  273,  394.

Smith,  B.  B.,  refs.,  time  correlation,
393.

Smith,  C.  D.,  refs.,  Tchebycheff  in-
equalities, 400.

Snow,  E.  C.,  refs.,  estimates  of  popu-
lation, 130,  253  ;  lines  and  planes
of  closest  fit,  209. 

Soil  surveys,  errors  in,  refs.,  402. 


Soper,  H.  E.,  refs.,  probable  error
of  correlation  coefficient,  355,
397,  399  ;  of  biserial  expression
for  correlation-coefficient,  355  ;
Frequency  Arrays,  395  ;  sampling,
397,  399,  400;  tables  of  ex-
ponential binomial  limit,  273.

Southey,  Robert,  cited  re  Cosin's
Names  of  the  Roman  Catholics,
etc.,  100.

Spearman,  C.,  effect  of  errors  of
observation  on  the  standard- 
deviation  and  coefficient  of  corre- 
lation, 213-214.  Refs.,  effect  of 
errors  of  observation,  225,  333, 
392  ;  rank  method  of  correlation, 
333,  397. 

Splawa-Neyman,  J.,  refs.,  probable 
errors,  398. 

Spurious  correlation  of  indices,  215- 
216  ;  refs.,  226,  392. 

Standard-deviation.  See  Deviation, 
standard. 

Standardisation  of  death-rates,  223- 
225  ;  refs.,  226,  392. 

Statist,  occurrence  of  the  word  in 
Shakespeare  and  in  Milton,  1. 

Statistical,  introduction  and  de- 
velopment in  the  meaning  of  the 
word,  1-5 ;  S.  Account  of  Scotland,
2  ;  Royal  S.  Society,  3  ;  methods, 
purport  of,  3-5  ;  def.,  5. 

Statistics,  introduction  and  develop- 
ment in  meaning  of  word,  1-5  ; 
def.,  5  ;  theory  of,  def.,  5. 

Statures  of  males  in  U.K.,  tables,  88, 
90 ;  diagrams,  89,  91  ;  calcula- 
tion of  mean,  112;  means  and 
medians,  117,  (qu.  1)  131  ;  stan- 
dard-deviation, 141  ;  percentiles, 
153 ;  standard-deviation,  mean 
deviation,  and  quartiles,  (qu.  1) 
155  ;  distribution  fitted  to  normal 
curve,  305-306,  307-308 ;  dia- 
gram, 306 ;  standard  errors  of 
mean  and  median,  of  first  to 
ninth  deciles,  341,  343,  344- 
345  ;  of  standard-deviation  and 
semi-interquartile  range,  (qu.  5) 
355. 

—  correlation  of,  for  father  and 
son,  160  ;  diagrams,  facing  166, 
174;  constants,  (qu.  3)  189; 
testing  for  normality,  322-328  ; 
for  isotropy,  329-331  ;  diagram
of  diagonal  distribution,  325,  of
fitted  contour  lines,  327.

Stead,  H.  G.,  correlation-coefficients,
ref.,  392.

Steffensen,  J.  F.,  refs.,  Recent  Re- 
searches, 407. 

Stevenson,  T.  H.  C.,  refs.,  birth-
rates, correction  of,  for  age-
distribution,  226. 

Stigmatic  rays  on  poppies,  fre- 
quency, 78  ;  unsuitability  of 
median  for  such  distributions,  116. 

Stirling,  James,  expression  for  fac- 
torials of  large  numbers,  304. 

Stoessiger,  B.,  refs.,  probability 
integrals  for  small  samples,  401. 

Stratton,  F.  J.  M.,  refs.,  errors  of 
agricultural  experiment,  402. 

"Student"  (pseudonym),  refs.,  law 
of  small  chances,  273,  394  ;  prob- 
able errors,  355,  397,  398  (under 
Fisher,  R.  A.) ;  deviations  from 
Poisson's  Law,  394 ;  probable 
errors  of  Spearman's  correlation- 
coefficients,  397  ;  method  of 
cereal  testing,  402. 

Surface,  F.  M.,  refs.,  errors  in  variety 
tests,  402. 

Symmetrical  frequency-distribu-
tions, 87-90.  See  also  Frequency-
distributions  ;  Normal  curve. 

Symons,  G.  J.,  use  of  word  "  sta- 
tistics "  in  British  Rainfall,  3. 

Tables,  calculating,  of  functions, 
etc.,  refs.,  357-359,  401  ;  see  also 
under  subject-headings. 

Tabulation,  of  statistics  of  attri- 
butes, 11-14,  37  ;  of  a  frequency- 
distribution,  81-83  ;  of  a  correla- 
tion table,  164. 

Tappan,  M.,  refs.,  partial  correlation, 
394. 

Tatham,  John,  refs.,  standardisation
of  death-rates,  226.

Tchebycheff,  P.  L.  See  Chebysheff.

Tedin,  O.,  refs.,  yield  trials,  404.

Thiele,  T.  N.,  refs.,  The  Theory  of
Observations,  394.

Thomson,  G.  H.,  refs.,  The  Essentials
of  Mental  Measurement,  404.

Thorndike,  E.  L.,  refs.,  methods
of  measuring  correlation,  333  ;
Theory  of  Mental  and  Social
Measurements,  361.


Time-correlation  problem,  197-201  ; 
refs.,  208-209,  392-393. 

Tippett,  L.  H.  C.,  refs.,  extremes  of
sample,  399 ;  The  Methods  of 
Statistics,  407. 

Tocher,  J.  F.,  refs.,  contingency,  390. 

Todhunter,  I.,  refs.,  History  of  the
Mathematical  Theory  of  Proba- 
bility, 6. 

Trachtenberg,  M.  I.,  refs.,  property
of  median,  154.

Trought,  Trevor,  refs.,  cotton  trials,
402.

Tschebysheff,  P.  L.  See  Chebysheff.

Tschuprow,  A.  A.,  refs.,  partial
correlation,  394 ;  mathematical
expectation  of  moments,  397  ;
distribution  of  means,  398 ;  Korre-
lations-theorie,  405.

Type  of  array,  def.,  164.

Ultimate  classes  and  frequencies, 
def.,  12  ;  sufficiency  of,  for  tabu- 
lation, 12-13. 

Universe,  def.,  17  ;  specification  of, 
17,  18. 

U-shaped  frequency-distributions,
102-105. 

Value,  annual,  of  dwelling-houses,
table,  83  ;  median,  (qu.  4)  131  ;
quartiles,  (qu.  3)  155.

—  of  estates  in  1715,  table,  100  ;
diagram,  101.

Variables,  theory  of,  generally,  75-
253  ;  def.,  7,  75.

Variance,  for  square  of  standard
deviation,  144  ;  refs.,  analysis  of
variance,  Irwin,  394 ;  Tippett,
407.

Variates,  def.,  150. 

Variation,  coefficient  of,  149  ;  stan- 
dard error  of,  351,  352. 

Variety  trials,  errors  in,  refs.,  401- 
404. 

Venn,  John,  refs.,  Logic  of  Chance,
sex-ratio,  273,  361.

Verschaeffelt,  E.,  relative  dispersion,
149.  Refs.,  measure  of  relative
dispersion,  154.

Vigor,  H.  D.,  data  cited  from,  163.
Refs.,  sex-ratio,  273.

Vik,  Knut,  refs.,  yield  trials,  (under
Behrens)  403.

Wages  of  agricultural  labourers.
See  Earnings.
Wages,  real,  refs.,  390-391.

Walker,  Helen  M.,  refs.,  History  of
Statistical  Method,  390.

Warner,  F.,  refs.,  study  of  defects  in
school  children,  notation  for  sta-
tistics of  attributes,  15.

Water  analysis,  methods,  refs.,  394.

Waters,  A.  C.,  refs.,  estimating
intercensal  populations,  130.

Weather  and  crops,  correlation,  196-
197  ;  refs.,  208.

Weight  of  males  in  U.K.,  table,  95 ;
diagram,  94  ;  mean,  median,  and
mode,  (qu.  2)  131  ;  standard
deviation,  mean  deviation,  and
quartiles,  (qu.  2)  155.

Weighted  mean.  See  Mean,
weighted  ;  also  Mean,  geometric  ;
Median  ;  Mode.

Weldon,  W.  F.  R.,  dice-throwing
experiments,  258-259,  373-376.

West,  C.  J.,  refs.,  Introduction  to
Mathematical  Statistics,  405.

Westergaard,  H.,  refs.,  Theorie  der
Statistik,  6,  273,  361,  406.

Whipple,  G.  C.,  refs.,  Vital  Statistics,
406.

Whitaker,  Lucy,  ref.,  law  of  small
numbers,  273.

Whittaker,  E.  T.,  refs.,  Calculus  of
Observations,  405.

Wicksell,  S.  D.,  refs.,  correlation,
391,  392.

Will,  H.  S.,  refs.,  curve-fitting,  393.

Willcox,  W.  F.,  citation  of  Bielfeld,
1.

Winkler,  Wilhelm,  refs.,  Grundriss 
der  Statistik,  407. 

Wishart,  John,  refs.,  sampling,  399, 
400,  401  ;  agricultural  experi- 
ment, 403,  404. 

Wolfenden,  H.  H.,  ref.,  mortalities 
and  death-rates,  392. 

Woo,  T.  L.,  refs.,  sampling,  400. 

Wood,  Frances,  refs.,  index-correla-
tions,  226,  252 ;  index-numbers,
390.

Wood,  T.  B.,  refs.,  errors  of  agricul- 
tural experiment,  401,  402  ;  varia- 
tion in  mangels,  402. 

Woods,  Hilda  M.,  refs.,  Medical 
Statistics,  407. 

Working,  H.,  refs.,  time  series,  393. 

Working  classes,  cost  of  living,  refs., 
390-391. 

Yield  trials,  refs.,  401-404.

Young,  Allyn  A.,  refs.,  age  statis-
tics, 105.

Young,  Andrew,  refs.,  probable  error 
of  coefficient  of  contingency,  397. 

Yule,  G.  U.,  use  of  term  character- 
istic lines  (lines  of  regression),  177 ; 
problem  of  pauperism,  192  ;  data 
cited  from,  78,  93,  122,  140,  163, 
185  ;  facing  186,  259,  385.  Refs., 
history  of  words  "  statistics," 
"  statistical,"  5  ;  attributes,  asso- 
ciation, consistence,  etc.,  15,  23, 
39,  40,  57  ;  isotropy,  influence  of 
bias  in  statistics  of  qualities,  73  ; 
correlation,  188,  226,  252,  392  ; 
correlation  between  indices,  226  ; 
frequency-curves,  314,  396  ;  prob- 
able errors,  355,  396  ;  pauperism, 
130,  208,  253;  birth-rates,  208, 
226  ;  sex-ratio,  273  ;  fluctuations 
of  sampling  in  Mendelian  ratios, 
273 ;  time-correlation  problem, 
392  ;  application  of  law  of  small 
chances,  394  ;  goodness  of  fit  in 
association  and  contingency 
tables,  396  ;  yield  trials,  402. 

Zimmermann,  E.  A.  W.,  use  of  the
words  "  statistics,"  "  statistical,"
in  English,  1.

Zimmermann,  H.,  multiplication
table,  358.

Zizek,  F.,  refs.,  Die  statistischen  Mit-
telwerthe  and  translation,  129.